19294 lines
932 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_sl.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
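The SettingWithCopyWarning above is raised when an in-place operation targets a frame that pandas considers a slice of another frame. A minimal sketch of one common way to avoid it follows; the data and the filter condition are hypothetical (the log does not show how `mask_check` is built), and only the `sort_values` call mirrors the logged line.

```python
import pandas as pd

# Hypothetical stand-in for the real data; only 'ligand_distance' appears in the log.
df = pd.DataFrame(
    {
        "mutation": ["A1B", "C2D", "E3F"],
        "ligand_distance": [9.5, 2.1, 5.0],
    }
)

# Assumption: mask_check is a filtered subset of a larger frame.
# .copy() makes it an independent DataFrame rather than a slice.
mask_check = df[df["ligand_distance"] < 10].copy()

# Reassigning the sorted result avoids inplace=True on a possible slice.
mask_check = mask_check.sort_values(by=["ligand_distance"], ascending=True)

print(mask_check["ligand_distance"].tolist())
```

Taking an explicit copy and reassigning the result keeps the original frame untouched, which is usually what a sanity-check variable such as `mask_check` needs.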
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
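The ConvergenceWarning above suggests raising `max_iter` or scaling the data. The sketch below shows both on synthetic data; the dataset, the step names and the value `max_iter=3000` are assumptions for illustration, not settings taken from this run.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real training data (assumption).
X, y = make_classification(n_samples=170, n_features=20, random_state=42)

# Scale the features, then give lbfgs a larger iteration budget than the default of 100.
pipe = Pipeline(
    steps=[
        ("scale", MinMaxScaler()),
        ("model", LogisticRegression(random_state=42, max_iter=3000)),
    ]
)
pipe.fit(X, y)

# n_iter_ holds the iterations actually used; a value below max_iter means the solver converged.
n_iter = pipe.named_steps["model"].n_iter_
print(n_iter)
```

If the warning persists after scaling and a larger `max_iter`, the scikit-learn page linked in the warning lists alternative solvers.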
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerial columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 424

PASS: my_features_df and aa_df successfully combined
nrows: 424
ncols: 265
count of NULL values before imputation

or_mychisq          102
log10_or_mychisq    102
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 166
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 173

-------------------------------------------------------------
Successfully split data according to scaling law: 1/np.sqrt(x_ncols)
Train data size: (170, 173)
Test data size: 0.07602859212697055 (15, 173)
y_train numbers: Counter({1: 105, 0: 65})
y_train ratio: 0.6190476190476191

y_test_numbers: Counter({1: 9, 0: 6})
y_test ratio: 0.6666666666666666
-------------------------------------------------------------

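The split sizes reported above follow from the logged rule 1/np.sqrt(x_ncols). A minimal sketch of that arithmetic, assuming the test count is rounded up (as scikit-learn's `train_test_split` does for a fractional `test_size`); the variable names are illustrative and not taken from the script.

```python
import math

n_rows = 185      # "ML source data size: (185, 173)" in this log
n_features = 173  # number of x_features columns

# Scaling-law test fraction: 1 / sqrt(number of features)
test_fraction = 1 / math.sqrt(n_features)

# Assumption: the test count is rounded up and the remainder is used for training.
n_test = math.ceil(n_rows * test_fraction)
n_train = n_rows - n_test

print(test_fraction, n_train, n_test)
```

With 173 features the fraction is about 0.076, which gives the 170/15 train/test split shown in the log.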
Simple Random OverSampling
Counter({0: 105, 1: 105})
(210, 173)

Simple Random UnderSampling
Counter({0: 65, 1: 65})
(130, 173)

Simple Combined Over and UnderSampling
Counter({0: 105, 1: 105})
(210, 173)

SMOTE_NC OverSampling
Counter({0: 105, 1: 105})
(210, 173)

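The class counts above can be reproduced with plain random resampling of row indices. The log does not show which library produced them, so the following is a standard-library sketch of the idea only; the function names are hypothetical, and SMOTE-NC (which synthesises new rows instead of repeating existing ones) is not covered.

```python
import random
from collections import Counter


def _indices_by_class(y):
    """Group row indices by class label."""
    groups = {}
    for i, label in enumerate(y):
        groups.setdefault(label, []).append(i)
    return groups


def random_oversample(y, seed=42):
    """Return row indices with minority rows repeated up to the majority count."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    target = max(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(ix)
        keep.extend(rng.choices(ix, k=target - len(ix)))  # sample with replacement
    return keep


def random_undersample(y, seed=42):
    """Return row indices with every class cut down to the minority count."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    target = min(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(rng.sample(ix, target))  # sample without replacement
    return keep


# Training labels with the class balance reported in the log: 105 vs 65.
y_train = [1] * 105 + [0] * 65
over = random_oversample(y_train)
under = random_undersample(y_train)
over_counts = Counter(y_train[i] for i in over)
under_counts = Counter(y_train[i] for i in under)
print(over_counts, len(over))
print(under_counts, len(under))
```

Oversampling yields 210 rows (105 per class) and undersampling yields 130 rows (65 per class), matching the shapes printed above.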
#####################################################################

Running ML analysis: scaling law split
Gene name: pncA
Drug name: pyrazinamide

Output directory: /home/tanu/git/Data/pyrazinamide/output/ml/tts_sl/
Sanity checks:
ML source data size: (185, 173)
Total input features: (170, 173)
Target feature numbers: Counter({1: 105, 0: 65})
Target features ratio: 0.6190476190476191

#####################################################################


================================================================

Strucutral features (n): 34
These are:
Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.030159 0.03234315 0.06500244 0.0333209 0.03272486 0.04780293
 0.03167248 0.15353703 0.06054592 0.03304076]

mean value: 0.052014946937561035

key: score_time
value: [0.01172686 0.01196408 0.02330327 0.01240492 0.01344848 0.01356339
 0.01245809 0.01469016 0.01229882 0.01356554]

mean value: 0.013942360877990723

key: test_mcc
value: [0.63262663 0.66299354 0.77151675 0.38122129 0.66299354 0.63262663
 0.04351941 0.33371191 0.60385964 0.30389487]

mean value: 0.5028964212212766

key: train_mcc
value: [0.83628052 0.87638923 0.80724696 0.80552514 0.7938003 0.76492233
 0.79554375 0.82448293 0.83762196 0.79554375]

mean value: 0.8137356868892114

key: test_accuracy
value: [0.82352941 0.82352941 0.88235294 0.70588235 0.82352941 0.82352941
 0.52941176 0.70588235 0.82352941 0.70588235]

mean value: 0.7647058823529411

key: train_accuracy
value: [0.92156863 0.94117647 0.90849673 0.90849673 0.90196078 0.88888889
 0.90196078 0.91503268 0.92156863 0.90196078]

mean value: 0.9111111111111111
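The per-fold scores above can be summarised with a few lines of standard-library Python. The sketch below recomputes the logged means for `test_mcc` and `train_mcc` and includes a small helper for the Matthews correlation coefficient from confusion-matrix counts; the helper `mcc` and the name `gap` are illustrative and not part of the original script.

```python
import math
from statistics import mean


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Fold scores copied from the log above (10-fold cross-validation).
test_mcc = [0.63262663, 0.66299354, 0.77151675, 0.38122129, 0.66299354,
            0.63262663, 0.04351941, 0.33371191, 0.60385964, 0.30389487]
train_mcc = [0.83628052, 0.87638923, 0.80724696, 0.80552514, 0.7938003,
             0.76492233, 0.79554375, 0.82448293, 0.83762196, 0.79554375]

mean_test = mean(test_mcc)
mean_train = mean(train_mcc)
gap = mean_train - mean_test  # a large gap points to overfitting on the training folds

print(round(mean_test, 4), round(mean_train, 4), round(gap, 4))
```

The spread across test folds (0.04 to 0.77) is wide, which is expected with roughly 17 samples per validation fold.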
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(

key: test_fscore
value: [0.85714286 0.86956522 0.90909091 0.7826087 0.86956522 0.85714286
 0.6 0.7826087 0.86956522 0.8 ]

mean value: 0.8197289666854884

key: train_fscore
value: [0.94 0.95431472 0.93 0.92929293 0.92537313 0.91370558
 0.92462312 0.93467337 0.93939394 0.92462312]

mean value: 0.9315999905573705

key: test_precision
value: [0.81818182 0.76923077 0.83333333 0.69230769 0.76923077 0.9
 0.66666667 0.75 0.83333333 0.71428571]

mean value: 0.7746570096570097

key: train_precision
value: [0.8952381 0.92156863 0.88571429 0.89320388 0.87735849 0.87378641
 0.87619048 0.88571429 0.89423077 0.87619048]

mean value: 0.8879195797557542

key: test_recall
value: [0.9 1. 1. 0.9 1. 0.81818182
 0.54545455 0.81818182 0.90909091 0.90909091]

mean value: 0.88

key: train_recall
value: [0.98947368 0.98947368 0.97894737 0.96842105 0.97894737 0.95744681
 0.9787234 0.9893617 0.9893617 0.9787234 ]

mean value: 0.9798880179171333

key: test_roc_auc
value: [0.80714286 0.78571429 0.85714286 0.66428571 0.78571429 0.82575758
 0.52272727 0.65909091 0.78787879 0.62121212]

mean value: 0.7316666666666667

key: train_roc_auc
value: [0.89990926 0.92577132 0.88602541 0.88938294 0.87740472 0.86855391
 0.87919221 0.89298594 0.90146051 0.87919221]

mean value: 0.8899878429737624

key: test_jcc
value: [0.75 0.76923077 0.83333333 0.64285714 0.76923077 0.75
 0.42857143 0.64285714 0.76923077 0.66666667]

mean value: 0.7021978021978023

key: train_jcc
value: [0.88679245 0.91262136 0.86915888 0.86792453 0.86111111 0.8411215
 0.85981308 0.87735849 0.88571429 0.85981308]

mean value: 0.8721428769802886

MCC on Blind test: 0.61

Accuracy on Blind test: 0.8

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.90216947 0.87260532 1.08087659 0.74459434 1.0994246 1.10648322
 1.21706295 0.96376276 0.86980605 0.99782419]

mean value: 0.9854609489440918

key: score_time
value: [0.01360488 0.01396418 0.01334357 0.01384592 0.01390672 0.01344657
 0.01328254 0.0135169 0.01473761 0.01464629]

mean value: 0.013829517364501952

key: test_mcc
value: [0.51428571 0.50920105 0.30988989 0.51428571 0.50920105 0.88273483
 0.2030906 0.48484848 0.74242424 0.48484848]

mean value: 0.5154810071962633

key: train_mcc
value: [1. 1. 1. 1. 0.91830889 0.98625704
 0.90411865 1. 0.89069566 1. ]

mean value: 0.96993802415093

key: test_accuracy
value: [0.76470588 0.76470588 0.64705882 0.76470588 0.76470588 0.94117647
 0.58823529 0.76470588 0.88235294 0.76470588]

mean value: 0.7647058823529411

key: train_accuracy
value: [1. 1. 1. 1. 0.96078431 0.99346405
 0.95424837 1. 0.94771242 1. ]

mean value: 0.9856209150326798

key: test_fscore
value: [0.8 0.81818182 0.66666667 0.8 0.81818182 0.95238095
 0.63157895 0.81818182 0.90909091 0.81818182]

mean value: 0.8032444748234222

key: train_fscore
value: [1. 1. 1. 1. 0.96938776 0.99470899
 0.96373057 1. 0.95876289 1. ]

mean value: 0.988659020635716

key: test_precision
value: [0.8 0.75 0.75 0.8 0.75 1.
 0.75 0.81818182 0.90909091 0.81818182]

mean value: 0.8145454545454546

key: train_precision
value: [1. 1. 1. 1. 0.94059406 0.98947368
 0.93939394 1. 0.93 1. ]

mean value: 0.9799461683010406

key: test_recall
value: [0.8 0.9 0.6 0.8 0.9 0.90909091
 0.54545455 0.81818182 0.90909091 0.81818182]

mean value: 0.8

key: train_recall
value: [1. 1. 1. 1. 1. 1. 0.9893617
 1. 0.9893617 1. ]

mean value: 0.997872340425532

key: test_roc_auc
value: [0.75714286 0.73571429 0.65714286 0.75714286 0.73571429 0.95454545
 0.60606061 0.74242424 0.87121212 0.74242424]

mean value: 0.755952380952381

key: train_roc_auc
value: [1. 1. 1. 1. 0.94827586 0.99152542
 0.94383339 1. 0.93535882 1. ]

mean value: 0.9818993496400015

key: test_jcc
value: [0.66666667 0.69230769 0.5 0.66666667 0.69230769 0.90909091
 0.46153846 0.69230769 0.83333333 0.69230769]

mean value: 0.6806526806526807

key: train_jcc
value: [1. 1. 1. 1. 0.94059406 0.98947368
 0.93 1. 0.92079208 1. ]

mean value: 0.9780859822824388

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01363158 0.01329041 0.00920653 0.01015902 0.00940776 0.00897503
 0.00879431 0.00902987 0.01002741 0.00941849]

mean value: 0.010194039344787598

key: score_time
value: [0.01313639 0.01162434 0.00912452 0.01039958 0.00888896 0.00871277
 0.00869632 0.00936913 0.00976944 0.00974512]

mean value: 0.009946656227111817

key: test_mcc
value: [ 0.38122129 0.50920105 0.77151675 0.13241022 0.24688536 -0.01899343
 0.22727273 -0.01899343 0.17069719 0.17069719]

mean value: 0.25719149163785127

key: train_mcc
value: [0.5048764 0.47629849 0.5048764 0.45884418 0.48537027 0.43135777
 0.49226514 0.47721276 0.44691625 0.39720759]

mean value: 0.46752252677089945

key: test_accuracy
value: [0.70588235 0.76470588 0.88235294 0.58823529 0.64705882 0.58823529
 0.64705882 0.58823529 0.64705882 0.64705882]

mean value: 0.6705882352941177

key: train_accuracy
value: [0.77124183 0.75816993 0.77124183 0.75163399 0.76470588 0.71895425
 0.76470588 0.75816993 0.74509804 0.68627451]

mean value: 0.7490196078431373

key: test_fscore
value: [0.7826087 0.81818182 0.90909091 0.66666667 0.72727273 0.72
 0.72727273 0.72 0.75 0.75 ]

mean value: 0.7571093544137023

key: train_fscore
value: [0.82233503 0.81218274 0.82233503 0.81 0.82524272 0.81385281
 0.81818182 0.81407035 0.80597015 0.71084337]

mean value: 0.8055014016865908

key: test_precision
value: [0.69230769 0.75 0.83333333 0.63636364 0.66666667 0.64285714
 0.72727273 0.64285714 0.69230769 0.69230769]

mean value: 0.6976273726273726

key: train_precision
value: [0.79411765 0.78431373 0.79411765 0.77142857 0.76576577 0.68613139
 0.77884615 0.77142857 0.75700935 0.81944444]

mean value: 0.7722603259177057

key: test_recall
value: [0.9 0.9 1. 0.7 0.8 0.81818182
 0.72727273 0.81818182 0.81818182 0.81818182]

mean value: 0.8300000000000001

key: train_recall
value: [0.85263158 0.84210526 0.85263158 0.85263158 0.89473684 1.
 0.86170213 0.86170213 0.86170213 0.62765957]

mean value: 0.8507502799552071

key: test_roc_auc
value: [0.66428571 0.73571429 0.85714286 0.56428571 0.61428571 0.49242424
 0.61363636 0.49242424 0.57575758 0.57575758]

mean value: 0.6185714285714285

key: train_roc_auc
value: [0.74528131 0.73139746 0.74528131 0.71941924 0.72323049 0.63559322
 0.73593581 0.72746123 0.71051208 0.7036603 ]

mean value: 0.7177772440103329

key: test_jcc
value: [0.64285714 0.69230769 0.83333333 0.5 0.57142857 0.5625
 0.57142857 0.5625 0.6 0.6 ]

mean value: 0.6136355311355312

key: train_jcc
value: [0.69827586 0.68376068 0.69827586 0.68067227 0.70247934 0.68613139
 0.69230769 0.68644068 0.675 0.55140187]

mean value: 0.675474564194314

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
|
|
value: [0.00986075 0.01002407 0.00997305 0.01180744 0.01076865 0.01062441
|
|
0.01006889 0.00927901 0.01006889 0.00926876]
|
|
|
|
mean value: 0.010174393653869629
|
|
|
|
key: score_time
|
|
value: [0.00983071 0.00966644 0.00964904 0.01030135 0.01011753 0.00926757
|
|
0.00965786 0.00972462 0.00960636 0.00913143]
|
|
|
|
mean value: 0.009695291519165039
|
|
|
|
key: test_mcc
|
|
value: [ 0.23975611 0.38251843 0.63262663 0.02857143 0.50920105 0.2030906
|
|
-0.13241022 -0.28787879 0.13241022 0.17069719]
|
|
|
|
mean value: 0.18785826455207946
|
|
|
|
key: train_mcc
|
|
value: [0.42092813 0.39333516 0.32656704 0.46856319 0.39056476 0.41094842
|
|
0.47583844 0.38542713 0.43627743 0.37735366]
|
|
|
|
mean value: 0.40858033755597606
|
|
|
|
key: test_accuracy
|
|
value: [0.64705882 0.70588235 0.82352941 0.52941176 0.76470588 0.58823529
|
|
0.41176471 0.41176471 0.58823529 0.64705882]
|
|
|
|
mean value: 0.611764705882353
|
|
|
|
key: train_accuracy
|
|
value: [0.7254902 0.7124183 0.69281046 0.74509804 0.71895425 0.7254902
|
|
0.75163399 0.70588235 0.73202614 0.70588235]
|
|
|
|
mean value: 0.7215686274509804
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.76190476 0.85714286 0.6 0.81818182 0.63157895
|
|
0.44444444 0.54545455 0.66666667 0.75 ]
|
|
|
|
mean value: 0.6825374041163514
|
|
|
|
key: train_fscore
|
|
value: [0.77659574 0.76595745 0.76616915 0.78918919 0.78172589 0.78350515
|
|
0.79787234 0.75675676 0.78074866 0.76190476]
|
|
|
|
mean value: 0.776042510006011
|
|
|
|
key: test_precision
|
|
value: [0.64285714 0.72727273 0.81818182 0.6 0.75 0.75
|
|
0.57142857 0.54545455 0.7 0.69230769]
|
|
|
|
mean value: 0.6797502497502498
|
|
|
|
key: train_precision
|
|
value: [0.78494624 0.77419355 0.72641509 0.81111111 0.75490196 0.76
|
|
0.79787234 0.76923077 0.78494624 0.75789474]
|
|
|
|
mean value: 0.772151203423883
|
|
|
|
key: test_recall
|
|
value: [0.9 0.8 0.9 0.6 0.9 0.54545455
|
|
0.36363636 0.54545455 0.63636364 0.81818182]
|
|
|
|
mean value: 0.7009090909090909
|
|
|
|
key: train_recall
|
|
value: [0.76842105 0.75789474 0.81052632 0.76842105 0.81052632 0.80851064
|
|
0.79787234 0.74468085 0.77659574 0.76595745]
|
|
|
|
mean value: 0.7809406494960806
|
|
|
|
key: test_roc_auc
|
|
value: [0.59285714 0.68571429 0.80714286 0.51428571 0.73571429 0.60606061
|
|
0.43181818 0.35606061 0.56818182 0.57575758]
|
|
|
|
mean value: 0.5873593073593074
|
|
|
|
key: train_roc_auc
|
|
value: [0.71179673 0.69791289 0.65526316 0.7376588 0.68974592 0.70086549
|
|
0.73791922 0.69437432 0.71880635 0.68806347]
|
|
|
|
mean value: 0.7032406345084143
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.61538462 0.75 0.42857143 0.69230769 0.46153846
|
|
0.28571429 0.375 0.5 0.6 ]
|
|
|
|
mean value: 0.5308516483516483
|
|
|
|
key: train_jcc
|
|
value: [0.63478261 0.62068966 0.62096774 0.65178571 0.64166667 0.6440678
|
|
0.66371681 0.60869565 0.64035088 0.61538462]
|
|
|
|
mean value: 0.6342108142276903
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00887656 0.01209188 0.0101974 0.00957799 0.00946403 0.00974488
|
|
0.0111022 0.00958228 0.01056123 0.01019073]
|
|
|
|
mean value: 0.010138916969299316
|
|
|
|
key: score_time
|
|
value: [0.05371165 0.02205682 0.01622033 0.01528668 0.01598525 0.01636815
|
|
0.01718068 0.01583552 0.01490521 0.01607275]
|
|
|
|
mean value: 0.020362305641174316
|
|
|
|
key: test_mcc
|
|
value: [ 0.13241022 -0.11769798 -0.46409548 -0.07377111 -0.38729833 0.13241022
|
|
-0.11948803 -0.28787879 0.06356417 -0.11769798]
|
|
|
|
mean value: -0.12395430776647315
|
|
|
|
key: train_mcc
|
|
value: [0.40852687 0.3435988 0.41056782 0.3789188 0.30157232 0.3679126
|
|
0.34836646 0.43470567 0.35262985 0.35371983]
|
|
|
|
mean value: 0.37005190349511996
|
|
|
|
key: test_accuracy
|
|
value: [0.58823529 0.47058824 0.35294118 0.52941176 0.41176471 0.58823529
|
|
0.52941176 0.41176471 0.58823529 0.47058824]
|
|
|
|
mean value: 0.49411764705882355
|
|
|
|
key: train_accuracy
|
|
value: [0.73202614 0.70588235 0.73202614 0.71895425 0.68627451 0.7124183
|
|
0.69934641 0.73856209 0.70588235 0.70588235]
|
|
|
|
mean value: 0.7137254901960784
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.57142857 0.52173913 0.66666667 0.58333333 0.66666667
|
|
0.66666667 0.54545455 0.69565217 0.57142857]
|
|
|
|
mean value: 0.6155702992659514
|
|
|
|
key: train_fscore
|
|
value: [0.80382775 0.79069767 0.8 0.79227053 0.76923077 0.79047619
|
|
0.76767677 0.7979798 0.784689 0.7826087 ]
|
|
|
|
mean value: 0.7879457173246753
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.54545455 0.46153846 0.57142857 0.5 0.7
|
|
0.61538462 0.54545455 0.66666667 0.6 ]
|
|
|
|
mean value: 0.5842291042291042
|
|
|
|
key: train_precision
|
|
value: [0.73684211 0.70833333 0.74545455 0.73214286 0.7079646 0.71551724
|
|
0.73076923 0.75961538 0.71304348 0.71681416]
|
|
|
|
mean value: 0.7266496937280635
|
|
|
|
key: test_recall
|
|
value: [0.7 0.6 0.6 0.8 0.7 0.63636364
|
|
0.72727273 0.54545455 0.72727273 0.54545455]
|
|
|
|
mean value: 0.6581818181818182
|
|
|
|
key: train_recall
|
|
value: [0.88421053 0.89473684 0.86315789 0.86315789 0.84210526 0.88297872
|
|
0.80851064 0.84042553 0.87234043 0.86170213]
|
|
|
|
mean value: 0.8613325867861142
|
|
|
|
key: test_roc_auc
|
|
value: [0.56428571 0.44285714 0.3 0.47142857 0.35 0.56818182
|
|
0.4469697 0.35606061 0.53030303 0.43939394]
|
|
|
|
mean value: 0.4469480519480519
|
|
|
|
key: train_roc_auc
|
|
value: [0.68348457 0.64564428 0.69019964 0.67295826 0.63656987 0.66182834
|
|
0.66696718 0.70834836 0.6565092 0.65966462]
|
|
|
|
mean value: 0.6682174330774522
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.4 0.35294118 0.5 0.41176471 0.5
|
|
0.5 0.375 0.53333333 0.4 ]
|
|
|
|
mean value: 0.44730392156862747
|
|
|
|
key: train_jcc
|
|
value: [0.672 0.65384615 0.66666667 0.656 0.625 0.65354331
|
|
0.62295082 0.66386555 0.64566929 0.64285714]
|
|
|
|
mean value: 0.6502398927685779
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01092124 0.01062822 0.01066852 0.0106132 0.01072216 0.01072073
|
|
0.0104661 0.01049161 0.01138711 0.01187062]
|
|
|
|
mean value: 0.01084895133972168
|
|
|
|
key: score_time
|
|
value: [0.00920558 0.0091362 0.00914645 0.00922441 0.00928783 0.00917053
|
|
0.00909495 0.00927687 0.00960684 0.00933766]
|
|
|
|
mean value: 0.009248733520507812
|
|
|
|
key: test_mcc
|
|
value: [ 0.29880715 0.43643578 0.29880715 0.09944903 0.06546537 0.11236664
|
|
-0.01899343 -0.01899343 0.3385016 0.11236664]
|
|
|
|
mean value: 0.17242125126907817
|
|
|
|
key: train_mcc
|
|
value: [0.52447344 0.48191696 0.45248357 0.59244006 0.56560446 0.58429818
|
|
0.58307945 0.59739548 0.51726562 0.57111391]
|
|
|
|
mean value: 0.5470071110789382
|
|
|
|
key: test_accuracy
|
|
value: [0.64705882 0.70588235 0.64705882 0.58823529 0.58823529 0.64705882
|
|
0.58823529 0.58823529 0.70588235 0.64705882]
|
|
|
|
mean value: 0.6352941176470589
|
|
|
|
key: train_accuracy
|
|
value: [0.76470588 0.74509804 0.73202614 0.79738562 0.78431373 0.79084967
|
|
0.79738562 0.79738562 0.75816993 0.78431373]
|
|
|
|
mean value: 0.7751633986928105
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.8 0.76923077 0.69565217 0.72 0.76923077
|
|
0.72 0.72 0.81481481 0.76923077]
|
|
|
|
mean value: 0.7547390065650935
|
|
|
|
key: train_fscore
|
|
value: [0.84070796 0.82969432 0.82251082 0.85972851 0.85201794 0.85454545
|
|
0.85581395 0.85844749 0.83555556 0.85067873]
|
|
|
|
mean value: 0.8459700739469289
|
|
|
|
key: test_precision
|
|
value: [0.625 0.66666667 0.625 0.61538462 0.6 0.66666667
|
|
0.64285714 0.64285714 0.6875 0.66666667]
|
|
|
|
mean value: 0.6438598901098901
|
|
|
|
key: train_precision
|
|
value: [0.72519084 0.70895522 0.69852941 0.75396825 0.7421875 0.74603175
|
|
0.76033058 0.752 0.71755725 0.74015748]
|
|
|
|
mean value: 0.7344908286075713
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.8 0.9 0.90909091
|
|
0.81818182 0.81818182 1. 0.90909091]
|
|
|
|
mean value: 0.9154545454545455
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 0.9787234
|
|
1. 1. 1. ]
|
|
|
|
mean value: 0.9978723404255319
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.64285714 0.57142857 0.54285714 0.52142857 0.53787879
|
|
0.49242424 0.49242424 0.58333333 0.53787879]
|
|
|
|
mean value: 0.5493939393939393
|
|
|
|
key: train_roc_auc
|
|
value: [0.68965517 0.6637931 0.64655172 0.73275862 0.71551724 0.72881356
|
|
0.74359899 0.73728814 0.68644068 0.72033898]
|
|
|
|
mean value: 0.7064756208264422
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.66666667 0.625 0.53333333 0.5625 0.625
|
|
0.5625 0.5625 0.6875 0.625 ]
|
|
|
|
mean value: 0.6075
|
|
|
|
key: train_jcc
|
|
value: [0.72519084 0.70895522 0.69852941 0.75396825 0.7421875 0.74603175
|
|
0.74796748 0.752 0.71755725 0.74015748]
|
|
|
|
mean value: 0.7332545187238113
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.8853507 0.85311651 0.85744238 1.03879786 1.08934593 1.1350615
|
|
0.69020605 0.69074345 1.01780343 1.07197356]
|
|
|
|
mean value: 0.9329841375350952
|
|
|
|
key: score_time
|
|
value: [0.01643276 0.01357484 0.01391101 0.01479197 0.01400852 0.01294899
|
|
0.01254201 0.01256633 0.02390885 0.01315498]
|
|
|
|
mean value: 0.014784026145935058
|
|
|
|
key: test_mcc
|
|
value: [ 0.38251843 0.63262663 0.50920105 0.13241022 -0.01543033 0.69631062
|
|
0.29012943 0.17069719 0.74242424 -0.01899343]
|
|
|
|
mean value: 0.3521894047164522
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.70588235 0.82352941 0.76470588 0.58823529 0.52941176 0.82352941
|
|
0.64705882 0.64705882 0.88235294 0.58823529]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.76190476 0.85714286 0.81818182 0.66666667 0.63636364 0.84210526
|
|
0.7 0.75 0.90909091 0.72 ]
|
|
|
|
mean value: 0.7661455912508545
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.81818182 0.75 0.63636364 0.58333333 1.
|
|
0.77777778 0.69230769 0.90909091 0.64285714]
|
|
|
|
mean value: 0.7537185037185037
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.9 0.9 0.7 0.7 0.72727273
|
|
0.63636364 0.81818182 0.90909091 0.81818182]
|
|
|
|
mean value: 0.7909090909090909
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.68571429 0.80714286 0.73571429 0.56428571 0.49285714 0.86363636
|
|
0.65151515 0.57575758 0.87121212 0.49242424]
|
|
|
|
mean value: 0.6740259740259741
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.61538462 0.75 0.69230769 0.5 0.46666667 0.72727273
|
|
0.53846154 0.6 0.83333333 0.5625 ]
|
|
|
|
mean value: 0.6285926573426573
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.022614 0.01708698 0.01282835 0.01164603 0.01256561 0.01273346
|
|
0.01293755 0.0132103 0.01343513 0.01459908]
|
|
|
|
mean value: 0.014365649223327637
|
|
|
|
key: score_time
|
|
value: [0.02368164 0.00920272 0.0085516 0.0085299 0.00866437 0.00875759
|
|
0.00901079 0.0088861 0.00909567 0.00948334]
|
|
|
|
mean value: 0.010386371612548828
|
|
|
|
key: test_mcc
|
|
value: [0.38251843 0.77151675 0.7 0.51428571 0.88273483 0.88273483
|
|
1. 0.69631062 0.60385964 0.88273483]
|
|
|
|
mean value: 0.731669564240748
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.70588235 0.88235294 0.82352941 0.76470588 0.94117647 0.94117647
|
|
1. 0.82352941 0.82352941 0.94117647]
|
|
|
|
mean value: 0.8647058823529412
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.76190476 0.90909091 0.82352941 0.8 0.95238095 0.95238095
|
|
1. 0.84210526 0.86956522 0.95238095]
|
|
|
|
mean value: 0.8863338420452433
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.83333333 1. 0.8 0.90909091 1.
|
|
1. 1. 0.83333333 1. ]
|
|
|
|
mean value: 0.9103030303030303
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 1. 0.7 0.8 1. 0.90909091
|
|
1. 0.72727273 0.90909091 0.90909091]
|
|
|
|
mean value: 0.8754545454545455
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.68571429 0.85714286 0.85 0.75714286 0.92857143 0.95454545
|
|
1. 0.86363636 0.78787879 0.95454545]
|
|
|
|
mean value: 0.863917748917749
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.61538462 0.83333333 0.7 0.66666667 0.90909091 0.90909091
|
|
1. 0.72727273 0.76923077 0.90909091]
|
|
|
|
mean value: 0.8039160839160839
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10089493 0.09773612 0.09653449 0.09245515 0.0937891 0.09260631
|
|
0.09319448 0.09338927 0.09292459 0.09129429]
|
|
|
|
mean value: 0.09448187351226807
|
|
|
|
key: score_time
|
|
value: [0.01952076 0.01778722 0.01751566 0.01846743 0.01758528 0.01765728
|
|
0.01768947 0.01766562 0.01741314 0.01764417]
|
|
|
|
mean value: 0.017894601821899413
|
|
|
|
key: test_mcc
|
|
value: [0.38122129 0.77151675 0.50920105 0.27142857 0.24688536 0.63262663
|
|
0.22727273 0.11236664 0.60385964 0.22727273]
|
|
|
|
mean value: 0.3983651389885671
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.70588235 0.88235294 0.76470588 0.64705882 0.64705882 0.82352941
|
|
0.64705882 0.64705882 0.82352941 0.64705882]
|
|
|
|
mean value: 0.7235294117647059
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.90909091 0.81818182 0.7 0.72727273 0.85714286
|
|
0.72727273 0.76923077 0.86956522 0.72727273]
|
|
|
|
mean value: 0.7887638448508013
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.69230769 0.83333333 0.75 0.7 0.66666667 0.9
|
|
0.72727273 0.66666667 0.83333333 0.72727273]
|
|
|
|
mean value: 0.7496853146853146
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 0.9 0.7 0.8 0.81818182
|
|
0.72727273 0.90909091 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8390909090909091
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.66428571 0.85714286 0.73571429 0.63571429 0.61428571 0.82575758
|
|
0.61363636 0.53787879 0.78787879 0.61363636]
|
|
|
|
mean value: 0.6885930735930736
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.83333333 0.69230769 0.53846154 0.57142857 0.75
|
|
0.57142857 0.625 0.76923077 0.57142857]
|
|
|
|
mean value: 0.656547619047619
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00948 0.01067305 0.00918198 0.00951219 0.0090723 0.00886846
|
|
0.00886393 0.00893688 0.00908589 0.01140523]
|
|
|
|
mean value: 0.009507989883422852
|
|
|
|
key: score_time
|
|
value: [0.00905252 0.00954819 0.00903249 0.00884628 0.00869274 0.00862312
|
|
0.00868821 0.00864053 0.00882077 0.00961161]
|
|
|
|
mean value: 0.008955645561218261
|
|
|
|
key: test_mcc
|
|
value: [ 0.11769798 -0.27774603 0.27142857 0.13241022 0.38122129 0.22727273
|
|
0.33371191 0.38251843 0.29012943 0.22727273]
|
|
|
|
mean value: 0.20859172445585672
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.52941176 0.41176471 0.64705882 0.58823529 0.70588235 0.64705882
|
|
0.70588235 0.70588235 0.64705882 0.64705882]
|
|
|
|
mean value: 0.6235294117647059
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.54545455 0.7 0.66666667 0.7826087 0.72727273
|
|
0.7826087 0.76190476 0.7 0.72727273]
|
|
|
|
mean value: 0.6893788819875777
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.5 0.7 0.63636364 0.69230769 0.72727273
|
|
0.75 0.8 0.77777778 0.72727273]
|
|
|
|
mean value: 0.6977661227661227
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.4 0.6 0.7 0.7 0.9 0.72727273
|
|
0.81818182 0.72727273 0.63636364 0.72727273]
|
|
|
|
mean value: 0.6936363636363636
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.55714286 0.37142857 0.63571429 0.56428571 0.66428571 0.61363636
|
|
0.65909091 0.6969697 0.65151515 0.61363636]
|
|
|
|
mean value: 0.6027705627705628
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.375 0.53846154 0.5 0.64285714 0.57142857
|
|
0.64285714 0.61538462 0.53846154 0.57142857]
|
|
|
|
mean value: 0.5329212454212454
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.27949595 1.2522831 1.23157477 1.28226137 1.31405354 1.33238959
|
|
1.32696438 1.29168916 1.21863532 1.26388741]
|
|
|
|
mean value: 1.2793234586715698
|
|
|
|
key: score_time
|
|
value: [0.10485697 0.09505439 0.0964644 0.17082667 0.10032129 0.09584975
|
|
0.09731865 0.12909579 0.12827754 0.13281274]
|
|
|
|
mean value: 0.11508781909942627
|
|
|
|
key: test_mcc
|
|
value: [0.66299354 0.66299354 0.63262663 0.50920105 0.77151675 0.63262663
|
|
0.87400737 0.4608824 0.60385964 0.60385964]
|
|
|
|
mean value: 0.6414567202607909
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.82352941 0.82352941 0.82352941 0.76470588 0.88235294 0.82352941
|
|
0.94117647 0.76470588 0.82352941 0.82352941]
|
|
|
|
mean value: 0.8294117647058823
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.86956522 0.85714286 0.81818182 0.90909091 0.85714286
|
|
0.95652174 0.83333333 0.86956522 0.86956522]
|
|
|
|
mean value: 0.8709674383587427
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.76923077 0.81818182 0.75 0.83333333 0.9
|
|
0.91666667 0.76923077 0.83333333 0.83333333]
|
|
|
|
mean value: 0.8192540792540792
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.9 0.9 1. 0.81818182
|
|
1. 0.90909091 0.90909091 0.90909091]
|
|
|
|
mean value: 0.9345454545454546
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.78571429 0.80714286 0.73571429 0.85714286 0.82575758
|
|
0.91666667 0.70454545 0.78787879 0.78787879]
|
|
|
|
mean value: 0.7994155844155845
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.76923077 0.75 0.69230769 0.83333333 0.75
|
|
0.91666667 0.71428571 0.76923077 0.76923077]
|
|
|
|
mean value: 0.7733516483516484
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: fit_time
|
|
value: [1.94012594 0.93395567 0.88578105 0.94665599 1.04666948 1.01622057
|
|
1.05611706 1.00596952 1.00941062 1.69974065]
|
|
|
|
mean value: 1.154064655303955
|
|
|
|
key: score_time
|
|
value: [0.16018629 0.12835646 0.13689685 0.14965868 0.14444923 0.15999627
|
|
0.15356684 0.15862679 0.17733407 0.12240601]
|
|
|
|
mean value: 0.14914774894714355
|
|
|
|
key: test_mcc
|
|
value: [0.55328334 0.66299354 0.77151675 0.50920105 0.66299354 0.63262663
|
|
0.4608824 0.4608824 0.87400737 0.62678317]
|
|
|
|
mean value: 0.6215170201634855
|
|
|
|
key: train_mcc
|
|
value: [0.87638923 0.88986734 0.88986734 0.93172069 0.88986734 0.8640452
|
|
0.87733952 0.89069566 0.89069566 0.87733952]
|
|
|
|
mean value: 0.8877827514166594
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.82352941 0.88235294 0.76470588 0.82352941 0.82352941
|
|
0.76470588 0.76470588 0.94117647 0.82352941]
|
|
|
|
mean value: 0.8176470588235294
|
|
|
|
key: train_accuracy
|
|
value: [0.94117647 0.94771242 0.94771242 0.96732026 0.94771242 0.93464052
|
|
0.94117647 0.94771242 0.94771242 0.94117647]
|
|
|
|
mean value: 0.9464052287581699
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.86956522 0.90909091 0.81818182 0.86956522 0.85714286
|
|
0.83333333 0.83333333 0.95652174 0.88 ]
|
|
|
|
mean value: 0.8660067758328628
|
|
|
|
key: train_fscore
|
|
value: [0.95431472 0.95918367 0.95918367 0.97435897 0.95918367 0.94897959
|
|
0.95384615 0.95876289 0.95876289 0.95384615]
|
|
|
|
mean value: 0.9580422388304239
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.76923077 0.83333333 0.75 0.76923077 0.9
|
|
0.76923077 0.76923077 0.91666667 0.78571429]
|
|
|
|
mean value: 0.7976923076923077
|
|
|
|
key: train_precision
|
|
value: [0.92156863 0.93069307 0.93069307 0.95 0.93069307 0.91176471
|
|
0.92079208 0.93 0.93 0.92079208]
|
|
|
|
mean value: 0.9276996699669967
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.9 1. 0.81818182
|
|
0.90909091 0.90909091 1. 1. ]
|
|
|
|
mean value: 0.9536363636363636
|
|
|
|
key: train_recall
|
|
value: [0.98947368 0.98947368 0.98947368 1. 0.98947368 0.9893617
|
|
0.9893617 0.9893617 0.9893617 0.9893617 ]
|
|
|
|
mean value: 0.9904703247480403
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.78571429 0.85714286 0.73571429 0.78571429 0.82575758
|
|
0.70454545 0.70454545 0.91666667 0.75 ]
|
|
|
|
mean value: 0.778008658008658
|
|
|
|
key: train_roc_auc
|
|
value: [0.92577132 0.93439201 0.93439201 0.95689655 0.93439201 0.91840966
|
|
0.92688424 0.93535882 0.93535882 0.92688424]
|
|
|
|
mean value: 0.9328739700888068
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.76923077 0.83333333 0.69230769 0.76923077 0.75
|
|
0.71428571 0.71428571 0.91666667 0.78571429]
|
|
|
|
mean value: 0.765934065934066
|
|
|
|
key: train_jcc
|
|
value: [0.91262136 0.92156863 0.92156863 0.95 0.92156863 0.90291262
|
|
0.91176471 0.92079208 0.92079208 0.91176471]
|
|
|
|
mean value: 0.9195353433116012
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01290655 0.01201439 0.0091145 0.00908995 0.00977802 0.00959659
|
|
0.00912237 0.0090313 0.00955582 0.00962853]
|
|
|
|
mean value: 0.00998380184173584
|
|
|
|
key: score_time
|
|
value: [0.01208782 0.00901628 0.00966477 0.0090487 0.00951076 0.00898027
|
|
0.00883889 0.00868464 0.00920987 0.00892973]
|
|
|
|
mean value: 0.009397172927856445
|
|
|
|
key: test_mcc
|
|
value: [ 0.23975611 0.38251843 0.63262663 0.02857143 0.50920105 0.2030906
|
|
-0.13241022 -0.28787879 0.13241022 0.17069719]
|
|
|
|
mean value: 0.18785826455207946
|
|
|
|
key: train_mcc
|
|
value: [0.42092813 0.39333516 0.32656704 0.46856319 0.39056476 0.41094842
|
|
0.47583844 0.38542713 0.43627743 0.37735366]
|
|
|
|
mean value: 0.40858033755597606
|
|
|
|
key: test_accuracy
|
|
value: [0.64705882 0.70588235 0.82352941 0.52941176 0.76470588 0.58823529
|
|
0.41176471 0.41176471 0.58823529 0.64705882]
|
|
|
|
mean value: 0.611764705882353
|
|
|
|
key: train_accuracy
|
|
value: [0.7254902 0.7124183 0.69281046 0.74509804 0.71895425 0.7254902
|
|
0.75163399 0.70588235 0.73202614 0.70588235]
|
|
|
|
mean value: 0.7215686274509804
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.76190476 0.85714286 0.6 0.81818182 0.63157895
|
|
0.44444444 0.54545455 0.66666667 0.75 ]
|
|
|
|
mean value: 0.6825374041163514
|
|
|
|
key: train_fscore
|
|
value: [0.77659574 0.76595745 0.76616915 0.78918919 0.78172589 0.78350515
|
|
0.79787234 0.75675676 0.78074866 0.76190476]
|
|
|
|
mean value: 0.776042510006011
|
|
|
|
key: test_precision
|
|
value: [0.64285714 0.72727273 0.81818182 0.6 0.75 0.75
|
|
0.57142857 0.54545455 0.7 0.69230769]
|
|
|
|
mean value: 0.6797502497502498
|
|
|
|
key: train_precision
|
|
value: [0.78494624 0.77419355 0.72641509 0.81111111 0.75490196 0.76
|
|
0.79787234 0.76923077 0.78494624 0.75789474]
|
|
|
|
mean value: 0.772151203423883
|
|
|
|
key: test_recall
|
|
value: [0.9 0.8 0.9 0.6 0.9 0.54545455
|
|
0.36363636 0.54545455 0.63636364 0.81818182]
|
|
|
|
mean value: 0.7009090909090909
|
|
|
|
key: train_recall
|
|
value: [0.76842105 0.75789474 0.81052632 0.76842105 0.81052632 0.80851064
|
|
0.79787234 0.74468085 0.77659574 0.76595745]
|
|
|
|
mean value: 0.7809406494960806
|
|
|
|
key: test_roc_auc
|
|
value: [0.59285714 0.68571429 0.80714286 0.51428571 0.73571429 0.60606061
|
|
0.43181818 0.35606061 0.56818182 0.57575758]
|
|
|
|
mean value: 0.5873593073593074
|
|
|
|
key: train_roc_auc
|
|
value: [0.71179673 0.69791289 0.65526316 0.7376588 0.68974592 0.70086549
|
|
0.73791922 0.69437432 0.71880635 0.68806347]
|
|
|
|
mean value: 0.7032406345084143
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.61538462 0.75 0.42857143 0.69230769 0.46153846
|
|
0.28571429 0.375 0.5 0.6 ]
|
|
|
|
mean value: 0.5308516483516483
|
|
|
|
key: train_jcc
|
|
value: [0.63478261 0.62068966 0.62096774 0.65178571 0.64166667 0.6440678
|
|
0.66371681 0.60869565 0.64035088 0.61538462]
|
|
|
|
mean value: 0.6342108142276903
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.95170927 1.08157611 4.2941637 1.35887098 1.44883847 1.49672079
|
|
1.49727535 1.45497322 4.00354838 5.43425512]
|
|
|
|
mean value: 2.3021931409835816
|
|
|
|
key: score_time
|
|
value: [0.01140547 0.05233741 0.01315928 0.01244068 0.0132606 0.01195669
|
|
0.01259422 0.01315713 0.02558994 0.01518345]
|
|
|
|
mean value: 0.018108487129211426
|
|
|
|
key: test_mcc
|
|
value: [0.38122129 0.66299354 0.88741197 0.63262663 0.75714286 0.87400737
|
|
1. 0.78334945 0.87400737 0.88273483]
|
|
|
|
mean value: 0.7735495312682757
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.70588235 0.82352941 0.94117647 0.82352941 0.88235294 0.94117647
|
|
1. 0.88235294 0.94117647 0.94117647]
|
|
|
|
mean value: 0.888235294117647
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.86956522 0.94736842 0.85714286 0.9 0.95652174
|
|
1. 0.9 0.95652174 0.95238095]
|
|
|
|
mean value: 0.912210962188079
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.69230769 0.76923077 1. 0.81818182 0.9 0.91666667
|
|
1. 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.9013053613053613
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 0.9 0.9 0.9 1.
|
|
1. 0.81818182 1. 0.90909091]
|
|
|
|
mean value: 0.9327272727272727
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.66428571 0.78571429 0.95 0.80714286 0.87857143 0.91666667
|
|
1. 0.90909091 0.91666667 0.95454545]
|
|
|
|
mean value: 0.8782683982683983
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.76923077 0.9 0.75 0.81818182 0.91666667
|
|
1. 0.81818182 0.91666667 0.90909091]
|
|
|
|
mean value: 0.8440875790875791
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04736233 0.06068015 0.0650456 0.06523633 0.05861163 0.05586934
|
|
0.0611794 0.06430578 0.06066442 0.05646658]
|
|
|
|
mean value: 0.059542155265808104
|
|
|
|
key: score_time
|
|
value: [0.0297966 0.0236876 0.01204276 0.02082038 0.02062225 0.01970601
|
|
0.02349544 0.0230937 0.02011728 0.02406526]
|
|
|
|
mean value: 0.021744728088378906
|
|
|
|
key: test_mcc
|
|
value: [0.51428571 0.63262663 0.30988989 0.50920105 0.07042952 0.74242424
|
|
0.53673944 0.4608824 0.2030906 0.78334945]
|
|
|
|
mean value: 0.4762918944486056
|
|
|
|
key: train_mcc
|
|
value: [0.95830113 0.94445829 0.98616507 0.95830113 1. 0.94483888
|
|
0.98625704 0.98625704 1. 0.97261224]
|
|
|
|
mean value: 0.9737190818635186
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.82352941 0.64705882 0.76470588 0.52941176 0.88235294
|
|
0.76470588 0.76470588 0.58823529 0.88235294]
|
|
|
|
mean value: 0.7411764705882353
|
|
|
|
key: train_accuracy
|
|
value: [0.98039216 0.97385621 0.99346405 0.98039216 1. 0.97385621
|
|
0.99346405 0.99346405 1. 0.9869281 ]
|
|
|
|
mean value: 0.9875816993464053
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.85714286 0.66666667 0.81818182 0.55555556 0.90909091
|
|
0.8 0.83333333 0.63157895 0.9 ]
|
|
|
|
mean value: 0.777155008733956
|
|
|
|
key: train_fscore
|
|
value: [0.98429319 0.97916667 0.9947644 0.98429319 1. 0.97894737
|
|
0.99470899 0.99470899 1. 0.98947368]
|
|
|
|
mean value: 0.9900356494056549
|
|
|
|
key: test_precision
|
|
value: [0.8 0.81818182 0.75 0.75 0.625 0.90909091
|
|
0.88888889 0.76923077 0.75 1. ]
|
|
|
|
mean value: 0.8060392385392385
|
|
|
|
key: train_precision
|
|
value: [0.97916667 0.96907216 0.98958333 0.97916667 1. 0.96875
|
|
0.98947368 0.98947368 1. 0.97916667]
|
|
|
|
mean value: 0.9843852866702839
|
|
|
|
key: test_recall
|
|
value: [0.8 0.9 0.6 0.9 0.5 0.90909091
|
|
0.72727273 0.90909091 0.54545455 0.81818182]
|
|
|
|
mean value: 0.7609090909090909
|
|
|
|
key: train_recall
|
|
value: [0.98947368 0.98947368 1. 0.98947368 1. 0.9893617
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9957782754759239
|
|
|
|
key: test_roc_auc
|
|
value: [0.75714286 0.80714286 0.65714286 0.73571429 0.53571429 0.87121212
|
|
0.78030303 0.70454545 0.60606061 0.90909091]
|
|
|
|
mean value: 0.7364069264069264
|
|
|
|
key: train_roc_auc
|
|
value: [0.97749546 0.96887477 0.99137931 0.97749546 1. 0.96925712
|
|
0.99152542 0.99152542 1. 0.98305085]
|
|
|
|
mean value: 0.9850603826239935
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.75 0.5 0.69230769 0.38461538 0.83333333
|
|
0.66666667 0.71428571 0.46153846 0.81818182]
|
|
|
|
mean value: 0.6487595737595737
|
|
|
|
key: train_jcc
|
|
value: [0.96907216 0.95918367 0.98958333 0.96907216 1. 0.95876289
|
|
0.98947368 0.98947368 1. 0.97916667]
|
|
|
|
mean value: 0.9803788258385285
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02010775 0.00967646 0.00920439 0.00903702 0.00907183 0.00884581
|
|
0.00869846 0.0089879 0.0096581 0.00971985]
|
|
|
|
mean value: 0.010300755500793457
|
|
|
|
key: score_time
|
|
value: [0.01351142 0.0090692 0.00885296 0.00862312 0.00895929 0.00884748
|
|
0.00848365 0.00872898 0.00927663 0.00903773]
|
|
|
|
mean value: 0.009339046478271485
|
|
|
|
key: test_mcc
|
|
value: [ 0.55328334 0.50920105 0.66299354 0.02857143 0.24688536 0.22727273
|
|
0.04351941 -0.11948803 0.33371191 0.49441323]
|
|
|
|
mean value: 0.29803639728108056
|
|
|
|
key: train_mcc
|
|
value: [0.42621329 0.33692443 0.38059794 0.36668738 0.39480728 0.43071005
|
|
0.43299259 0.3747783 0.37096514 0.38834821]
|
|
|
|
mean value: 0.3903024610468177
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.76470588 0.82352941 0.52941176 0.64705882 0.64705882
|
|
0.52941176 0.52941176 0.70588235 0.76470588]
|
|
|
|
mean value: 0.6705882352941176
|
|
|
|
key: train_accuracy
|
|
value: [0.73856209 0.69934641 0.71895425 0.7124183 0.7254902 0.73856209
|
|
0.73856209 0.7124183 0.7124183 0.71895425]
|
|
|
|
mean value: 0.7215686274509804
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.81818182 0.86956522 0.6 0.72727273 0.72727273
|
|
0.6 0.66666667 0.7826087 0.84615385]
|
|
|
|
mean value: 0.7471055031924597
|
|
|
|
key: train_fscore
|
|
value: [0.80392157 0.7745098 0.7902439 0.78431373 0.7961165 0.80392157
|
|
0.8 0.78 0.78431373 0.78606965]
|
|
|
|
mean value: 0.790341045119155
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.75 0.76923077 0.6 0.66666667 0.72727273
|
|
0.66666667 0.61538462 0.75 0.73333333]
|
|
|
|
mean value: 0.6992840492840493
|
|
|
|
key: train_precision
|
|
value: [0.75229358 0.72477064 0.73636364 0.73394495 0.73873874 0.74545455
|
|
0.75471698 0.73584906 0.72727273 0.73831776]
|
|
|
|
mean value: 0.7387722616886769
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 1. 0.6 0.8 0.72727273
|
|
0.54545455 0.72727273 0.81818182 1. ]
|
|
|
|
mean value: 0.8118181818181818
|
|
|
|
key: train_recall
|
|
value: [0.86315789 0.83157895 0.85263158 0.84210526 0.86315789 0.87234043
|
|
0.85106383 0.82978723 0.85106383 0.84042553]
|
|
|
|
mean value: 0.8497312430011198
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.73571429 0.78571429 0.51428571 0.61428571 0.61363636
|
|
0.52272727 0.4469697 0.65909091 0.66666667]
|
|
|
|
mean value: 0.6273376623376623
|
|
|
|
key: train_roc_auc
|
|
value: [0.69882033 0.65716878 0.67631579 0.67105263 0.68157895 0.69888208
|
|
0.70519293 0.67760548 0.67129463 0.68292463]
|
|
|
|
mean value: 0.682083622669467
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.69230769 0.76923077 0.42857143 0.57142857 0.57142857
|
|
0.42857143 0.5 0.64285714 0.73333333]
|
|
|
|
mean value: 0.6052014652014652
|
|
|
|
key: train_jcc
|
|
value: [0.67213115 0.632 0.65322581 0.64516129 0.66129032 0.67213115
|
|
0.66666667 0.63934426 0.64516129 0.64754098]
|
|
|
|
mean value: 0.6534652917327692
|
|
|
|
MCC on Blind test: -0.07
|
|
|
|
Accuracy on Blind test: 0.53
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01129508 0.01568604 0.01332641 0.03448272 0.01468325 0.01636481
|
|
0.03756166 0.05245757 0.02793503 0.01410508]
|
|
|
|
mean value: 0.02378976345062256
|
|
|
|
key: score_time
|
|
value: [0.00920486 0.01116633 0.01118207 0.02200365 0.01170421 0.01170301
|
|
0.0180769 0.02105355 0.01168871 0.01157594]
|
|
|
|
mean value: 0.01393592357635498
|
|
|
|
key: test_mcc
|
|
value: [0.51428571 0.55328334 0.63262663 0.38122129 0.29880715 0.88273483
|
|
0.49441323 0.33371191 0.47673129 0.30389487]
|
|
|
|
mean value: 0.4871710250874902
|
|
|
|
key: train_mcc
|
|
value: [0.90340823 0.86531409 0.86241574 0.83628052 0.65815792 0.95883964
|
|
0.51726562 0.87413232 0.61903367 0.73827438]
|
|
|
|
mean value: 0.783312212337401
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.76470588 0.82352941 0.70588235 0.64705882 0.94117647
|
|
0.76470588 0.70588235 0.64705882 0.70588235]
|
|
|
|
mean value: 0.7470588235294118
|
|
|
|
key: train_accuracy
|
|
value: [0.95424837 0.93464052 0.93464052 0.92156863 0.83006536 0.98039216
|
|
0.75816993 0.93464052 0.76470588 0.86928105]
|
|
|
|
mean value: 0.888235294117647
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.83333333 0.85714286 0.7826087 0.76923077 0.95238095
|
|
0.84615385 0.7826087 0.625 0.8 ]
|
|
|
|
mean value: 0.8048459149546106
|
|
|
|
key: train_fscore
|
|
value: [0.96410256 0.95 0.94680851 0.94 0.87962963 0.98395722
|
|
0.83555556 0.94382022 0.76315789 0.90384615]
|
|
|
|
mean value: 0.9110877752479482
|
|
|
|
key: test_precision
|
|
value: [0.8 0.71428571 0.81818182 0.69230769 0.625 1.
|
|
0.73333333 0.75 1. 0.71428571]
|
|
|
|
mean value: 0.7847394272394272
|
|
|
|
key: train_precision
|
|
value: [0.94 0.9047619 0.95698925 0.8952381 0.78512397 0.98924731
|
|
0.71755725 1. 1. 0.8245614 ]
|
|
|
|
mean value: 0.9013479181499102
|
|
|
|
key: test_recall
|
|
value: [0.8 1. 0.9 0.9 1. 0.90909091
|
|
1. 0.81818182 0.45454545 0.90909091]
|
|
|
|
mean value: 0.8690909090909091
|
|
|
|
key: train_recall
|
|
value: [0.98947368 1. 0.93684211 0.98947368 1. 0.9787234
|
|
1. 0.89361702 0.61702128 1. ]
|
|
|
|
mean value: 0.940515117581187
|
|
|
|
key: test_roc_auc
|
|
value: [0.75714286 0.71428571 0.80714286 0.66428571 0.57142857 0.95454545
|
|
0.66666667 0.65909091 0.72727273 0.62121212]
|
|
|
|
mean value: 0.7143073593073593
|
|
|
|
key: train_roc_auc
|
|
value: [0.9430127 0.9137931 0.93393829 0.89990926 0.77586207 0.98088713
|
|
0.68644068 0.94680851 0.80851064 0.83050847]
|
|
|
|
mean value: 0.8719670853832294
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.71428571 0.75 0.64285714 0.625 0.90909091
|
|
0.73333333 0.64285714 0.45454545 0.66666667]
|
|
|
|
mean value: 0.680530303030303
|
|
|
|
key: train_jcc
|
|
value: [0.93069307 0.9047619 0.8989899 0.88679245 0.78512397 0.96842105
|
|
0.71755725 0.89361702 0.61702128 0.8245614 ]
|
|
|
|
mean value: 0.842753929875216
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01497602 0.02522302 0.03343582 0.03448534 0.02283978 0.03451777
|
|
0.03477836 0.03254557 0.01978588 0.01348734]
|
|
|
|
mean value: 0.026607489585876463
|
|
|
|
key: score_time
|
|
value: [0.01177239 0.01971555 0.02048707 0.01987171 0.02130342 0.01973534
|
|
0.03330612 0.02216315 0.02278042 0.01166964]
|
|
|
|
mean value: 0.02028048038482666
|
|
|
|
key: test_mcc
|
|
value: [0.36780618 0.66299354 0.36780618 0.38122129 0.06546537 0.88273483
|
|
0.26967994 0.2030906 0.4608824 0.3385016 ]
|
|
|
|
mean value: 0.4000181931146463
|
|
|
|
key: train_mcc
|
|
value: [0.68055705 0.55202478 0.80732775 0.84960093 0.48191696 0.83778301
|
|
0.74694017 0.71803726 0.78917952 0.36822985]
|
|
|
|
mean value: 0.6831597278235293
|
|
|
|
key: test_accuracy
|
|
value: [0.64705882 0.82352941 0.64705882 0.70588235 0.58823529 0.94117647
|
|
0.47058824 0.58823529 0.76470588 0.70588235]
|
|
|
|
mean value: 0.6882352941176471
|
|
|
|
key: train_accuracy
|
|
value: [0.81045752 0.77777778 0.89542484 0.92810458 0.74509804 0.92156863
|
|
0.85620915 0.83660131 0.89542484 0.69281046]
|
|
|
|
mean value: 0.8359477124183007
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.86956522 0.625 0.7826087 0.72 0.95238095
|
|
0.30769231 0.63157895 0.83333333 0.81481481]
|
|
|
|
mean value: 0.7161974268633308
|
|
|
|
key: train_fscore
|
|
value: [0.81987578 0.84821429 0.90804598 0.94472362 0.82969432 0.93478261
|
|
0.86746988 0.84662577 0.92156863 0.8 ]
|
|
|
|
mean value: 0.8721000862893723
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.76923077 0.83333333 0.69230769 0.6 1.
|
|
1. 0.75 0.76923077 0.6875 ]
|
|
|
|
mean value: 0.7934935897435897
|
|
|
|
key: train_precision
|
|
value: [1. 0.73643411 1. 0.90384615 0.70895522 0.95555556
|
|
1. 1. 0.85454545 0.66666667]
|
|
|
|
mean value: 0.882600316302156
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 0.5 0.9 0.9 0.90909091
|
|
0.18181818 0.54545455 0.90909091 1. ]
|
|
|
|
mean value: 0.7345454545454545
|
|
|
|
key: train_recall
|
|
value: [0.69473684 1. 0.83157895 0.98947368 1. 0.91489362
|
|
0.76595745 0.73404255 1. 1. ]
|
|
|
|
mean value: 0.8930683090705487
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.78571429 0.67857143 0.66428571 0.52142857 0.95454545
|
|
0.59090909 0.60606061 0.70454545 0.58333333]
|
|
|
|
mean value: 0.6767965367965368
|
|
|
|
key: train_roc_auc
|
|
value: [0.84736842 0.70689655 0.91578947 0.90852995 0.6637931 0.9235485
|
|
0.88297872 0.86702128 0.86440678 0.60169492]
|
|
|
|
mean value: 0.8182027693803942
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.76923077 0.45454545 0.64285714 0.5625 0.90909091
|
|
0.18181818 0.46153846 0.71428571 0.6875 ]
|
|
|
|
mean value: 0.5837912087912088
|
|
|
|
key: train_jcc
|
|
value: [0.69473684 0.73643411 0.83157895 0.8952381 0.70895522 0.87755102
|
|
0.76595745 0.73404255 0.85454545 0.66666667]
|
|
|
|
mean value: 0.7765706358739792
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.13799787 0.19559717 0.13715434 0.13844132 0.13842058 0.20680475
|
|
0.13890123 0.13737702 0.13736558 0.13849258]
|
|
|
|
mean value: 0.15065524578094483
|
|
|
|
key: score_time
|
|
value: [0.02109694 0.02060199 0.02058935 0.02053523 0.02055168 0.02046013
|
|
0.02037907 0.02750397 0.02053142 0.02038026]
|
|
|
|
mean value: 0.0212630033493042
|
|
|
|
key: test_mcc
|
|
value: [0.50920105 0.66299354 0.7 0.63262663 0.54935027 0.53673944
|
|
1. 0.63262663 0.60385964 1. ]
|
|
|
|
mean value: 0.6827397199255986
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.82352941 0.82352941 0.82352941 0.76470588 0.76470588
|
|
1. 0.82352941 0.82352941 1. ]
|
|
|
|
mean value: 0.8411764705882353
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.81818182 0.86956522 0.82352941 0.85714286 0.77777778 0.8
|
|
1. 0.85714286 0.86956522 1. ]
|
|
|
|
mean value: 0.8672905156792625
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.76923077 1. 0.81818182 0.875 0.88888889
|
|
1. 0.9 0.83333333 1. ]
|
|
|
|
mean value: 0.8834634809634809
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 0.7 0.9 0.7 0.72727273
|
|
1. 0.81818182 0.90909091 1. ]
|
|
|
|
mean value: 0.8654545454545455
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.73571429 0.78571429 0.85 0.80714286 0.77857143 0.78030303
|
|
1. 0.82575758 0.78787879 1. ]
|
|
|
|
mean value: 0.8351082251082251
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.69230769 0.76923077 0.7 0.75 0.63636364 0.66666667
|
|
1. 0.75 0.76923077 1. ]
|
|
|
|
mean value: 0.7733799533799534
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08164883 0.06786728 0.0824008 0.08094072 0.07851148 0.07243514
|
|
0.07851815 0.07457829 0.06393242 0.07801318]
|
|
|
|
mean value: 0.07588462829589844
|
|
|
|
key: score_time
|
|
value: [0.02787757 0.02429485 0.02658248 0.0281651 0.02558804 0.02345562
|
|
0.02348995 0.02183986 0.02336168 0.02322936]
|
|
|
|
mean value: 0.024788451194763184
|
|
|
|
key: test_mcc
|
|
value: [0.66299354 0.77151675 0.78881064 0.63262663 0.75714286 0.87400737
|
|
1. 0.78334945 0.74242424 0.88273483]
|
|
|
|
mean value: 0.7895606313848224
|
|
|
|
key: train_mcc
|
|
value: [0.98625704 1. 0.95857961 1. 0.9722323 1.
|
|
0.97241255 0.98625704 0.95857961 0.98625704]
|
|
|
|
mean value: 0.9820575189370783
|
|
|
|
key: test_accuracy
|
|
value: [0.82352941 0.88235294 0.88235294 0.82352941 0.88235294 0.94117647
|
|
1. 0.88235294 0.88235294 0.94117647]
|
|
|
|
mean value: 0.8941176470588235
|
|
|
|
key: train_accuracy
|
|
value: [0.99346405 1. 0.98039216 1. 0.9869281 1.
|
|
0.9869281 0.99346405 0.98039216 0.99346405]
|
|
|
|
mean value: 0.9915032679738562
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.90909091 0.88888889 0.85714286 0.9 0.95652174
|
|
1. 0.9 0.90909091 0.95238095]
|
|
|
|
mean value: 0.9142681473116256
|
|
|
|
key: train_fscore
|
|
value: [0.99470899 1. 0.98412698 1. 0.98947368 1.
|
|
0.9893617 0.99470899 0.98412698 0.99470899]
|
|
|
|
mean value: 0.9931216338719138
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.83333333 1. 0.81818182 0.9 0.91666667
|
|
1. 1. 0.90909091 1. ]
|
|
|
|
mean value: 0.9146503496503496
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.9893617 1. 0.98947368 1.
|
|
0.9893617 0.98947368 0.97894737 0.98947368]
|
|
|
|
mean value: 0.9926091825307951
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.8 0.9 0.9 1.
|
|
1. 0.81818182 0.90909091 0.90909091]
|
|
|
|
mean value: 0.9236363636363636
|
|
|
|
key: train_recall
|
|
value: [0.98947368 1. 0.97894737 1. 0.98947368 1.
|
|
0.9893617 1. 0.9893617 1. ]
|
|
|
|
mean value: 0.9936618141097424
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.85714286 0.9 0.80714286 0.87857143 0.91666667
|
|
1. 0.90909091 0.87121212 0.95454545]
|
|
|
|
mean value: 0.888008658008658
|
|
|
|
key: train_roc_auc
|
|
value: [0.99473684 1. 0.98085299 1. 0.98611615 1.
|
|
0.98620627 0.99152542 0.9777317 0.99152542]
|
|
|
|
mean value: 0.9908694809882435
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.83333333 0.8 0.75 0.81818182 0.91666667
|
|
1. 0.81818182 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8448018648018648
|
|
|
|
key: train_jcc
|
|
value: [0.98947368 1. 0.96875 1. 0.97916667 1.
|
|
0.97894737 0.98947368 0.96875 0.98947368]
|
|
|
|
mean value: 0.9864035087719298
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08835268 0.09343934 0.08267331 0.09665108 0.07296014 0.09550357
|
|
0.10921788 0.09168172 0.07537723 0.08310604]
|
|
|
|
mean value: 0.0888962984085083
|
|
|
|
key: score_time
|
|
value: [0.03113747 0.02139211 0.03570366 0.02917075 0.02335262 0.03179145
|
|
0.03920627 0.03474498 0.03395534 0.02556109]
|
|
|
|
mean value: 0.03060157299041748
|
|
|
|
key: test_mcc
|
|
value: [ 0.13241022 0.38251843 0.23975611 -0.27774603 -0.18232322 0.33371191
|
|
0.04351941 0.11236664 0.4608824 0.33371191]
|
|
|
|
mean value: 0.1578807778938489
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.58823529 0.70588235 0.64705882 0.41176471 0.47058824 0.70588235
|
|
0.52941176 0.64705882 0.76470588 0.70588235]
|
|
|
|
mean value: 0.6176470588235294
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.76190476 0.75 0.54545455 0.60869565 0.7826087
|
|
0.6 0.76923077 0.83333333 0.7826087 ]
|
|
|
|
mean value: 0.7100503120068337
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.72727273 0.64285714 0.5 0.53846154 0.75
|
|
0.66666667 0.66666667 0.76923077 0.75 ]
|
|
|
|
mean value: 0.6647519147519148
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.7 0.8 0.9 0.6 0.7 0.81818182
|
|
0.54545455 0.90909091 0.90909091 0.81818182]
|
|
|
|
mean value: 0.77
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.56428571 0.68571429 0.59285714 0.37142857 0.42142857 0.65909091
|
|
0.52272727 0.53787879 0.70454545 0.65909091]
|
|
|
|
mean value: 0.5719047619047619
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.61538462 0.6 0.375 0.4375 0.64285714
|
|
0.42857143 0.625 0.71428571 0.64285714]
|
|
|
|
mean value: 0.5581456043956045
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.53
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.53434014 0.47350907 0.3815167 0.38136268 0.38167262 0.37864709
|
|
0.38625383 0.3857789 0.41710877 0.39069295]
|
|
|
|
mean value: 0.4110882759094238
|
|
|
|
key: score_time
|
|
value: [0.0129323 0.01272464 0.01275206 0.01263762 0.01286578 0.01278806
|
|
0.0126791 0.01264119 0.01395893 0.01256704]
|
|
|
|
mean value: 0.012854671478271485
|
|
|
|
key: test_mcc
|
|
value: [0.77151675 0.77151675 0.88741197 0.63262663 0.75714286 0.87400737
|
|
1. 0.78334945 0.87400737 1. ]
|
|
|
|
mean value: 0.8351579150791348
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.88235294 0.88235294 0.94117647 0.82352941 0.88235294 0.94117647
|
|
1. 0.88235294 0.94117647 1. ]
|
|
|
|
mean value: 0.9176470588235294
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.90909091 0.94736842 0.85714286 0.9 0.95652174
|
|
1. 0.9 0.95652174 1. ]
|
|
|
|
mean value: 0.9335736574638177
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.83333333 1. 0.81818182 0.9 0.91666667
|
|
1. 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.9218181818181819
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.9 0.9 0.9 1.
|
|
1. 0.81818182 1. 1. ]
|
|
|
|
mean value: 0.9518181818181818
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.85714286 0.95 0.80714286 0.87857143 0.91666667
|
|
1. 0.90909091 0.91666667 1. ]
|
|
|
|
mean value: 0.9092424242424243
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.83333333 0.9 0.75 0.81818182 0.91666667
|
|
1. 0.81818182 0.91666667 1. ]
|
|
|
|
mean value: 0.8786363636363637
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02650714 0.05811 0.05595326 0.04492354 0.07738733 0.04849219
|
|
0.03797841 0.04312611 0.06586456 0.07221007]
|
|
|
|
mean value: 0.05305526256561279
|
|
|
|
key: score_time
|
|
value: [0.01903915 0.03109527 0.02601171 0.02103806 0.02057695 0.01910973
|
|
0.02088594 0.0189662 0.02482319 0.02623796]
|
|
|
|
mean value: 0.022778415679931642
|
|
|
|
key: test_mcc
|
|
value: [ 0.13241022 0.38122129 -0.30550505 0.23975611 0.38122129 0.30389487
|
|
0.3385016 -0.01899343 0.06356417 0.30389487]
|
|
|
|
mean value: 0.18199659502726337
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.58823529 0.70588235 0.47058824 0.64705882 0.70588235 0.70588235
|
|
0.70588235 0.58823529 0.58823529 0.70588235]
|
|
|
|
mean value: 0.6411764705882353
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.7826087 0.64 0.75 0.7826087 0.8
|
|
0.81481481 0.72 0.69565217 0.8 ]
|
|
|
|
mean value: 0.7452351046698873
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.69230769 0.53333333 0.64285714 0.69230769 0.71428571
|
|
0.6875 0.64285714 0.66666667 0.71428571]
|
|
|
|
mean value: 0.6622764735264736
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.7 0.9 0.8 0.9 0.9 0.90909091
|
|
1. 0.81818182 0.72727273 0.90909091]
|
|
|
|
mean value: 0.8563636363636363
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.56428571 0.66428571 0.4 0.59285714 0.66428571 0.62121212
|
|
0.58333333 0.49242424 0.53030303 0.62121212]
|
|
|
|
mean value: 0.5734199134199134
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.64285714 0.47058824 0.6 0.64285714 0.66666667
|
|
0.6875 0.5625 0.53333333 0.66666667]
|
|
|
|
mean value: 0.597296918767507
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08099842 0.08981967 0.04990792 0.05527115 0.05714655 0.05031085
|
|
0.05187988 0.05764365 0.05021954 0.05180001]
|
|
|
|
mean value: 0.05949976444244385
|
|
|
|
key: score_time
|
|
value: [0.0333674 0.02936316 0.02458978 0.03190279 0.02809477 0.03452396
|
|
0.02786756 0.02454042 0.02761149 0.02961063]
|
|
|
|
mean value: 0.02914719581604004
|
|
|
|
key: test_mcc
|
|
value: [0.51428571 0.66299354 0.27142857 0.63262663 0.63262663 0.88273483
|
|
0.38251843 0.33371191 0.74242424 0.33371191]
|
|
|
|
mean value: 0.5389062395989187
|
|
|
|
key: train_mcc
|
|
value: [0.9306986 0.9587737 0.91649194 0.90340823 0.93172069 0.90411865
|
|
0.94559731 0.95906064 0.91761348 0.90330977]
|
|
|
|
mean value: 0.9270793013098675
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.82352941 0.64705882 0.82352941 0.82352941 0.94117647
|
|
0.70588235 0.70588235 0.88235294 0.70588235]
|
|
|
|
mean value: 0.7823529411764706
|
|
|
|
key: train_accuracy
|
|
value: [0.96732026 0.98039216 0.96078431 0.95424837 0.96732026 0.95424837
|
|
0.97385621 0.98039216 0.96078431 0.95424837]
|
|
|
|
mean value: 0.965359477124183
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.86956522 0.7 0.85714286 0.85714286 0.95238095
|
|
0.76190476 0.7826087 0.90909091 0.7826087 ]
|
|
|
|
mean value: 0.827244494635799
|
|
|
|
key: train_fscore
|
|
value: [0.97409326 0.98445596 0.96875 0.96410256 0.97435897 0.96373057
|
|
0.97916667 0.98429319 0.96875 0.96335079]
|
|
|
|
mean value: 0.9725051976931911
|
|
|
|
key: test_precision
|
|
value: [0.8 0.76923077 0.7 0.81818182 0.81818182 1.
|
|
0.8 0.75 0.90909091 0.75 ]
|
|
|
|
mean value: 0.8114685314685315
|
|
|
|
key: train_precision
|
|
value: [0.95918367 0.96938776 0.95876289 0.94 0.95 0.93939394
|
|
0.95918367 0.96907216 0.94897959 0.94845361]
|
|
|
|
mean value: 0.9542417293065305
|
|
|
|
key: test_recall
|
|
value: [0.8 1. 0.7 0.9 0.9 0.90909091
|
|
0.72727273 0.81818182 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8481818181818181
|
|
|
|
key: train_recall
|
|
value: [0.98947368 1. 0.97894737 0.98947368 1. 0.9893617
|
|
1. 1. 0.9893617 0.9787234 ]
|
|
|
|
mean value: 0.9915341545352744
|
|
|
|
key: test_roc_auc
|
|
value: [0.75714286 0.78571429 0.63571429 0.80714286 0.80714286 0.95454545
|
|
0.6969697 0.65909091 0.87121212 0.65909091]
|
|
|
|
mean value: 0.7633766233766234
|
|
|
|
key: train_roc_auc
|
|
value: [0.96025408 0.97413793 0.95499093 0.9430127 0.95689655 0.94383339
|
|
0.96610169 0.97457627 0.95230797 0.94698882]
|
|
|
|
mean value: 0.9573100346025291
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.76923077 0.53846154 0.75 0.75 0.90909091
|
|
0.61538462 0.64285714 0.83333333 0.64285714]
|
|
|
|
mean value: 0.7117882117882118
|
|
|
|
key: train_jcc
|
|
value: [0.94949495 0.96938776 0.93939394 0.93069307 0.95 0.93
|
|
0.95918367 0.96907216 0.93939394 0.92929293]
|
|
|
|
mean value: 0.946591242040257
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.411973 0.34421086 0.39803886 0.35081291 0.3494761 0.33145618
|
|
0.3288908 0.35393381 0.34595108 0.33966637]
|
|
|
|
mean value: 0.3554409980773926
|
|
|
|
key: score_time
|
|
value: [0.03101277 0.03190875 0.02901888 0.02524614 0.03131557 0.02620721
|
|
0.0313201 0.02892303 0.03440547 0.02825022]
|
|
|
|
mean value: 0.02976081371307373
|
|
|
|
key: test_mcc
|
|
value: [0.51428571 0.66299354 0.27142857 0.63262663 0.63262663 0.88273483
|
|
0.38251843 0.33371191 0.63262663 0.33371191]
|
|
|
|
mean value: 0.5279264781376682
|
|
|
|
key: train_mcc
|
|
value: [0.9306986 0.9587737 0.91649194 0.90340823 0.94445829 0.90411865
|
|
0.94559731 0.95906064 0.93118521 0.90330977]
|
|
|
|
mean value: 0.929710233179641
|
|
|
|
key: test_accuracy
|
|
value: [0.76470588 0.82352941 0.64705882 0.82352941 0.82352941 0.94117647
|
|
0.70588235 0.70588235 0.82352941 0.70588235]
|
|
|
|
mean value: 0.7764705882352941
|
|
|
|
key: train_accuracy
|
|
value: [0.96732026 0.98039216 0.96078431 0.95424837 0.97385621 0.95424837
|
|
0.97385621 0.98039216 0.96732026 0.95424837]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.86956522 0.7 0.85714286 0.85714286 0.95238095
|
|
0.76190476 0.7826087 0.85714286 0.7826087 ]
|
|
|
|
mean value: 0.8220496894409938
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:107: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:110: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.97409326 0.98445596 0.96875 0.96410256 0.97916667 0.96373057
|
|
0.97916667 0.98429319 0.97382199 0.96335079]
|
|
|
|
mean value: 0.9734931658768399
|
|
|
|
key: test_precision
|
|
value: [0.8 0.76923077 0.7 0.81818182 0.81818182 1.
|
|
0.8 0.75 0.9 0.75 ]
|
|
|
|
mean value: 0.8105594405594406
|
|
|
|
key: train_precision
|
|
value: [0.95918367 0.96938776 0.95876289 0.94 0.96907216 0.93939394
|
|
0.95918367 0.96907216 0.95876289 0.94845361]
|
|
|
|
mean value: 0.9571272752774962
|
|
|
|
key: test_recall
|
|
value: [0.8 1. 0.7 0.9 0.9 0.90909091
|
|
0.72727273 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8390909090909091
|
|
|
|
key: train_recall
|
|
value: [0.98947368 1. 0.97894737 0.98947368 0.98947368 0.9893617
|
|
1. 1. 0.9893617 0.9787234 ]
|
|
|
|
mean value: 0.990481522956327
|
|
|
|
key: test_roc_auc
|
|
value: [0.75714286 0.78571429 0.63571429 0.80714286 0.80714286 0.95454545
|
|
0.6969697 0.65909091 0.82575758 0.65909091]
|
|
|
|
mean value: 0.7588311688311689
|
|
|
|
key: train_roc_auc
|
|
value: [0.96025408 0.97413793 0.95499093 0.9430127 0.96887477 0.94383339
|
|
0.96610169 0.97457627 0.96078255 0.94698882]
|
|
|
|
mean value: 0.9593553143712085
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.76923077 0.53846154 0.75 0.75 0.90909091
|
|
0.61538462 0.64285714 0.75 0.64285714]
|
|
|
|
mean value: 0.7034548784548784
|
|
|
|
key: train_jcc
|
|
value: [0.94949495 0.96938776 0.93939394 0.93069307 0.95918367 0.93
|
|
0.95918367 0.96907216 0.94897959 0.92929293]
|
|
|
|
mean value: 0.9484681746314754
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
       'kd_values',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=166)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.04003 0.03358912 0.06656361 0.0678463 0.10102797 0.07133985
 0.07167172 0.07186866 0.08310914 0.07368183]

mean value: 0.06807281970977783

key: score_time
value: [0.01230335 0.0230813 0.02241945 0.02380037 0.02342224 0.01903343
 0.03061485 0.0213964 0.02022529 0.01746082]

mean value: 0.021375751495361327

key: test_mcc
value: [0.52295779 0.71562645 0.4719399 0.71562645 0.62641448 0.82572282
 0.44038551 0.52295779 0.71818182 0.42727273]

mean value: 0.5987085733914107

key: train_mcc
value: [0.8518477 0.80951848 0.83085028 0.79896965 0.83088812 0.84132139
 0.8518477 0.83068309 0.84166312 0.89500244]

mean value: 0.8382591983762472

key: test_accuracy
value: [0.76190476 0.85714286 0.71428571 0.85714286 0.80952381 0.9047619
 0.71428571 0.76190476 0.85714286 0.71428571]

mean value: 0.7952380952380952

key: train_accuracy
value: [0.92592593 0.9047619 0.91534392 0.8994709 0.91534392 0.92063492
 0.92592593 0.91534392 0.92063492 0.94708995]

mean value: 0.919047619047619

key: test_fscore
value: [0.73684211 0.84210526 0.75 0.84210526 0.77777778 0.9
 0.7 0.7826087 0.85714286 0.72727273]

mean value: 0.7915854689424483

key: train_fscore
value: [0.92631579 0.90526316 0.91666667 0.90052356 0.91489362 0.92063492
 0.92553191 0.91489362 0.91891892 0.94791667]

mean value: 0.9191558829401187

key: test_precision
value: [0.77777778 0.88888889 0.64285714 0.88888889 0.875 1.
 0.77777778 0.75 0.9 0.72727273]

mean value: 0.8228463203463203

key: train_precision
value: [0.92631579 0.90526316 0.90721649 0.89583333 0.92473118 0.91578947
 0.92553191 0.91489362 0.93406593 0.92857143]

mean value: 0.917821232657928

key: test_recall
value: [0.7 0.8 0.9 0.8 0.7 0.81818182
 0.63636364 0.81818182 0.81818182 0.72727273]

mean value: 0.7718181818181818

key: train_recall
value: [0.92631579 0.90526316 0.92631579 0.90526316 0.90526316 0.92553191
 0.92553191 0.91489362 0.90425532 0.96808511]

mean value: 0.9206718924972005

key: test_roc_auc
value: [0.75909091 0.85454545 0.72272727 0.85454545 0.80454545 0.90909091
 0.71818182 0.75909091 0.85909091 0.71363636]

mean value: 0.7954545454545454

key: train_roc_auc
value: [0.92592385 0.90475924 0.91528555 0.89944009 0.91539754 0.92066069
 0.92592385 0.91534155 0.92054871 0.94720045]

mean value: 0.9190481522956326

key: test_jcc
value: [0.58333333 0.72727273 0.6 0.72727273 0.63636364 0.81818182
 0.53846154 0.64285714 0.75 0.57142857]

mean value: 0.6595171495171496

key: train_jcc
value: [0.8627451 0.82692308 0.84615385 0.81904762 0.84313725 0.85294118
 0.86138614 0.84313725 0.85 0.9009901 ]

mean value: 0.850646156406203

MCC on Blind test: 0.43

Accuracy on Blind test: 0.73
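
The per-fold scores logged above can be cross-checked by hand. The sketch below is not part of the logged pipeline. It assumes that `jcc` denotes the Jaccard index, and that the first test fold of the Logistic Regression run had the confusion counts TP=7, TN=9, FP=2, FN=3. Those counts are inferred from the logged first-fold accuracy, precision and recall; they are not printed anywhere in the log. The function names are illustrative.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def jaccard(tp, fp, fn):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    return 0.0 if denom == 0 else tp / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Assumed counts for fold 1 (inferred, not logged): 21 test samples,
# 10 positives and 11 negatives.
TP, TN, FP, FN = 7, 9, 2, 3

print("accuracy:", accuracy(TP, TN, FP, FN))
print("mcc:     ", mcc(TP, TN, FP, FN))
print("jaccard: ", jaccard(TP, FP, FN))
```

Under those assumed counts, the three values agree with the first entries of test_accuracy (0.76190476), test_mcc (0.52295779) and test_jcc (0.58333333) for this model.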

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
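
The warning above names two remedies: scale the data, or raise `max_iter`. The logged pipeline already applies MinMaxScaler to the numerical columns, so the iteration limit is the more likely knob here (the scikit-learn default for LogisticRegressionCV is max_iter=100). The sketch below is a hypothetical illustration on synthetic data, not the project's code: the data set, the value 5000 and the helper name `fit_and_count_convergence_warnings` are all assumptions chosen for the example.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real feature matrix (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=42)


def fit_and_count_convergence_warnings(max_iter):
    """Fit a scaled LogisticRegressionCV and count ConvergenceWarnings raised."""
    model = Pipeline(steps=[
        ("prep", MinMaxScaler()),
        ("model", LogisticRegressionCV(max_iter=max_iter, random_state=42)),
    ])
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, no dedup
        model.fit(X, y)
    n_warnings = sum(
        issubclass(w.category, ConvergenceWarning) for w in caught
    )
    return model, n_warnings


for limit in (100, 5000):
    _, n = fit_and_count_convergence_warnings(limit)
    print(f"max_iter={limit}: {n} ConvergenceWarning(s)")
```

Whether a larger limit removes every warning depends on the data and the regularisation grid, so the count is printed rather than assumed.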
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [1.61443114 0.991431 1.31764698 1.33511066 1.55657887 1.8804636
2.35004854 1.59028721 1.81239128 2.00124836]

mean value: 1.644963765144348

key: score_time
value: [0.03566933 0.01466346 0.01258254 0.01850414 0.02412224 0.02417207
0.03879929 0.02366471 0.02081037 0.02138114]

mean value: 0.023436927795410158

key: test_mcc
value: [0.43007562 0.61818182 0.55161872 1. 0.80909091 0.82572282
0.55161872 0.62641448 0.90909091 0.55161872]

mean value: 0.6873432731232932

key: train_mcc
value: [1. 1. 1. 1. 1. 1.
1. 0.91555606 1. 1. ]

mean value: 0.9915556059051258

key: test_accuracy
value: [0.71428571 0.80952381 0.76190476 1. 0.9047619 0.9047619
0.76190476 0.80952381 0.95238095 0.76190476]

mean value: 0.8380952380952381

key: train_accuracy
value: [1. 1. 1. 1. 1. 1.
1. 0.95767196 1. 1. ]

mean value: 0.9957671957671957

key: test_fscore
value: [0.66666667 0.8 0.7826087 1. 0.9 0.9
0.73684211 0.83333333 0.95238095 0.73684211]

mean value: 0.8308673858559442

key: train_fscore
value: [1. 1. 1. 1. 1. 1.
1. 0.95789474 1. 1. ]

mean value: 0.9957894736842106

key: test_precision
value: [0.75 0.8 0.69230769 1. 0.9 1.
0.875 0.76923077 1. 0.875 ]

mean value: 0.8661538461538462

key: train_precision
value: [1. 1. 1. 1. 1. 1.
1. 0.94791667 1. 1. ]

mean value: 0.9947916666666666

key: test_recall
value: [0.6 0.8 0.9 1. 0.9 0.81818182
0.63636364 0.90909091 0.90909091 0.63636364]

mean value: 0.8109090909090909

key: train_recall
value: [1. 1. 1. 1. 1. 1.
1. 0.96808511 1. 1. ]

mean value: 0.9968085106382979

key: test_roc_auc
value: [0.70909091 0.80909091 0.76818182 1. 0.90454545 0.90909091
0.76818182 0.80454545 0.95454545 0.76818182]

mean value: 0.8395454545454546

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1.
1. 0.95772676 1. 1. ]

mean value: 0.9957726763717805

key: test_jcc
value: [0.5 0.66666667 0.64285714 1. 0.81818182 0.81818182
0.58333333 0.71428571 0.90909091 0.58333333]

mean value: 0.7235930735930736

key: train_jcc
value: [1. 1. 1. 1. 1. 1.
1. 0.91919192 1. 1. ]

mean value: 0.9919191919191919

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.03653574 0.01316571 0.01315188 0.0131731 0.01319766 0.01305985
0.01309562 0.01336765 0.01304221 0.01303864]

mean value: 0.015482807159423828

key: score_time
value: [0.01301122 0.01235342 0.01248431 0.01237154 0.01234913 0.01239204
0.01234293 0.01240659 0.01235342 0.01222396]

mean value: 0.012428855895996094

key: test_mcc
value: [ 0.44038551 -0.13762047 0.39196475 0.33709993 0.52727273 0.71562645
0.14545455 0.33709993 0.45226702 0.52295779]

mean value: 0.37325081711851577

key: train_mcc
value: [0.55158352 0.46765481 0.54179779 0.46109894 0.53609614 0.47825095
0.44012799 0.55442155 0.4230863 0.52563909]

mean value: 0.4979757093671649

key: test_accuracy
value: [0.71428571 0.42857143 0.66666667 0.66666667 0.76190476 0.85714286
0.57142857 0.66666667 0.71428571 0.76190476]

mean value: 0.680952380952381

key: train_accuracy
value: [0.77248677 0.71957672 0.76719577 0.73015873 0.76719577 0.73544974
0.71957672 0.77248677 0.69312169 0.75661376]

mean value: 0.7433862433862434

key: test_fscore
value: [0.72727273 0.45454545 0.72 0.58823529 0.76190476 0.86956522
0.57142857 0.72 0.76923077 0.7826087 ]

mean value: 0.696479149154341

key: train_fscore
value: [0.7902439 0.76233184 0.78640777 0.72432432 0.77777778 0.75490196
0.70718232 0.7902439 0.74336283 0.77884615]

mean value: 0.7615622779466328

key: test_precision
value: [0.66666667 0.41666667 0.6 0.71428571 0.72727273 0.83333333
0.6 0.64285714 0.66666667 0.75 ]

mean value: 0.6617748917748918

key: train_precision
value: [0.73636364 0.6640625 0.72972973 0.74444444 0.74757282 0.7
0.73563218 0.72972973 0.63636364 0.71052632]

mean value: 0.7134424991862677

key: test_recall
value: [0.8 0.5 0.9 0.5 0.8 0.90909091
0.54545455 0.81818182 0.90909091 0.81818182]

mean value: 0.75

key: train_recall
value: [0.85263158 0.89473684 0.85263158 0.70526316 0.81052632 0.81914894
0.68085106 0.86170213 0.89361702 0.86170213]

mean value: 0.8232810750279955

key: test_roc_auc
value: [0.71818182 0.43181818 0.67727273 0.65909091 0.76363636 0.85454545
0.57272727 0.65909091 0.70454545 0.75909091]

mean value: 0.68

key: train_roc_auc
value: [0.77206047 0.71864502 0.76674132 0.73029115 0.76696529 0.73589026
0.7193729 0.77295633 0.69417693 0.75716685]

mean value: 0.7434266517357223

key: test_jcc
value: [0.57142857 0.29411765 0.5625 0.41666667 0.61538462 0.76923077
0.4 0.5625 0.625 0.64285714]

mean value: 0.5459685412626589

key: train_jcc
value: [0.65322581 0.61594203 0.648 0.56779661 0.63636364 0.60629921
0.54700855 0.65322581 0.5915493 0.63779528]

mean value: 0.6157206219394032

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01332068 0.01345658 0.01338673 0.01330256 0.01338434 0.01339674
0.01414132 0.01341271 0.01346302 0.01340342]

mean value: 0.013466811180114746

key: score_time
value: [0.0123136 0.01233149 0.01238203 0.01235437 0.01238656 0.01232052
0.02520466 0.01238585 0.01245117 0.0123682 ]

mean value: 0.013649845123291015

key: test_mcc
value: [0.23373675 0.42817442 0.14545455 0.45226702 0.42817442 0.55161872
0.06741999 0.23636364 0.55161872 0.60302269]

mean value: 0.36978509083764477

key: train_mcc
value: [0.55646909 0.49995455 0.56569532 0.45906255 0.53448943 0.46424351
0.54251375 0.55585218 0.49572783 0.4861571 ]

mean value: 0.5160165310554382

key: test_accuracy
value: [0.61904762 0.66666667 0.57142857 0.71428571 0.66666667 0.76190476
0.52380952 0.61904762 0.76190476 0.76190476]

mean value: 0.6666666666666666

key: train_accuracy
value: [0.76719577 0.74074074 0.77248677 0.71957672 0.75661376 0.72486772
0.76190476 0.77248677 0.73544974 0.73544974]

mean value: 0.7486772486772486

key: test_fscore
value: [0.55555556 0.46153846 0.57142857 0.625 0.46153846 0.73684211
0.44444444 0.63636364 0.73684211 0.70588235]

mean value: 0.5935435694336623

key: train_fscore
value: [0.73170732 0.7030303 0.73939394 0.67484663 0.7195122 0.68292683
0.72392638 0.74556213 0.6835443 0.69512195]

mean value: 0.7099571975217122

key: test_precision
value: [0.625 1. 0.54545455 0.83333333 1. 0.875
0.57142857 0.63636364 0.875 1. ]

mean value: 0.7961580086580087

key: train_precision
value: [0.86956522 0.82857143 0.87142857 0.80882353 0.85507246 0.8
0.85507246 0.84 0.84375 0.81428571]

mean value: 0.8386569388625016

key: test_recall
value: [0.5 0.3 0.6 0.5 0.3 0.63636364
0.36363636 0.63636364 0.63636364 0.54545455]

mean value: 0.5018181818181818

key: train_recall
value: [0.63157895 0.61052632 0.64210526 0.57894737 0.62105263 0.59574468
0.62765957 0.67021277 0.57446809 0.60638298]

mean value: 0.6158678611422173

key: test_roc_auc
value: [0.61363636 0.65 0.57272727 0.70454545 0.65 0.76818182
0.53181818 0.61818182 0.76818182 0.77272727]

mean value: 0.665

key: train_roc_auc
value: [0.76791713 0.74143337 0.77318029 0.72032475 0.75733483 0.72418813
0.76119821 0.77194849 0.73460246 0.73477044]

mean value: 0.7486898096304592

key: test_jcc
value: [0.38461538 0.3 0.4 0.45454545 0.3 0.58333333
0.28571429 0.46666667 0.58333333 0.54545455]

mean value: 0.43036630036630036

key: train_jcc
value: [0.57692308 0.54205607 0.58653846 0.50925926 0.56190476 0.51851852
0.56730769 0.59433962 0.51923077 0.53271028]

mean value: 0.5508788517464236

MCC on Blind test: 0.17

Accuracy on Blind test: 0.6

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.01281309 0.01259708 0.01266217 0.0125277 0.01282144 0.01259828
0.01269078 0.01281691 0.01241755 0.0138123 ]

mean value: 0.012775731086730958

key: score_time
value: [0.02045655 0.03969097 0.05234361 0.0366714 0.03734803 0.03721189
0.03660083 0.03660321 0.0360918 0.05223918]

mean value: 0.038525748252868655

key: test_mcc
value: [-0.06741999 0.23373675 0.03015113 0.33636364 0.13858047 0.14545455
-0.26593594 0.42727273 0.33636364 0.18090681]

mean value: 0.14954737717597608

key: train_mcc
value: [0.57375166 0.50637592 0.61142844 0.55051844 0.49793339 0.55594205
0.6099783 0.5498651 0.55978224 0.55189788]

mean value: 0.5567473416942506

key: test_accuracy
value: [0.47619048 0.61904762 0.52380952 0.66666667 0.57142857 0.57142857
0.38095238 0.71428571 0.66666667 0.57142857]

mean value: 0.5761904761904761

key: train_accuracy
value: [0.78306878 0.75132275 0.8042328 0.77248677 0.74603175 0.77777778
0.8042328 0.77248677 0.77777778 0.76719577]

mean value: 0.7756613756613756

key: test_fscore
value: [0.35294118 0.55555556 0.375 0.66666667 0.4 0.57142857
0.13333333 0.72727273 0.66666667 0.47058824]

mean value: 0.49194529326882264

key: train_fscore
value: [0.76571429 0.73743017 0.79558011 0.75706215 0.72727273 0.77173913
0.79558011 0.75428571 0.76136364 0.73170732]

mean value: 0.7597735346629213

key: test_precision
value: [0.42857143 0.625 0.5 0.63636364 0.6 0.6
0.25 0.72727273 0.7 0.66666667]

mean value: 0.5733874458874458

key: train_precision
value: [0.8375 0.78571429 0.8372093 0.81707317 0.79012346 0.78888889
0.82758621 0.81481481 0.81707317 0.85714286]

mean value: 0.8173126154036517

key: test_recall
value: [0.3 0.5 0.3 0.7 0.3 0.54545455
0.09090909 0.72727273 0.63636364 0.36363636]

mean value: 0.44636363636363635

key: train_recall
value: [0.70526316 0.69473684 0.75789474 0.70526316 0.67368421 0.75531915
0.76595745 0.70212766 0.71276596 0.63829787]

mean value: 0.7111310190369541

key: test_roc_auc
value: [0.46818182 0.61363636 0.51363636 0.66818182 0.55909091 0.57272727
0.39545455 0.71363636 0.66818182 0.58181818]

mean value: 0.5754545454545454

key: train_roc_auc
value: [0.78348264 0.75162374 0.80447928 0.77284434 0.74641657 0.77765957
0.80403135 0.77211646 0.77743561 0.76651736]

mean value: 0.7756606942889137

key: test_jcc
value: [0.21428571 0.38461538 0.23076923 0.5 0.25 0.4
0.07142857 0.57142857 0.5 0.30769231]

mean value: 0.34302197802197804

key: train_jcc
value: [0.62037037 0.5840708 0.66055046 0.60909091 0.57142857 0.62831858
0.66055046 0.60550459 0.6146789 0.57692308]

mean value: 0.6131486712013626

MCC on Blind test: 0.05

Accuracy on Blind test: 0.53

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01701498 0.01680541 0.01711583 0.01694584 0.01700974 0.01702642
|
|
0.01688266 0.0170269 0.01668429 0.01705098]
|
|
|
|
mean value: 0.016956305503845213
|
|
|
|
key: score_time
|
|
value: [0.01340604 0.01336288 0.01338744 0.01337409 0.01337981 0.01351428
|
|
0.01340461 0.01340222 0.01311612 0.01343751]
|
|
|
|
mean value: 0.013378500938415527
|
|
|
|
key: test_mcc
|
|
value: [0.13762047 0.24120908 0.44038551 0.23373675 0.36244122 0.71818182
|
|
0.14545455 0.42727273 0.61818182 0.52727273]
|
|
|
|
mean value: 0.38517566540238385
|
|
|
|
key: train_mcc
|
|
value: [0.73654755 0.68655917 0.75666293 0.73549832 0.69804157 0.73867014
|
|
0.75694773 0.81109216 0.75994222 0.76164115]
|
|
|
|
mean value: 0.7441602942290075
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.61904762 0.71428571 0.61904762 0.66666667 0.85714286
|
|
0.57142857 0.71428571 0.80952381 0.76190476]
|
|
|
|
mean value: 0.6904761904761905
|
|
|
|
key: train_accuracy
|
|
value: [0.86772487 0.84126984 0.87830688 0.86772487 0.84656085 0.86772487
|
|
0.87830688 0.9047619 0.87830688 0.87830688]
|
|
|
|
mean value: 0.8708994708994708
|
|
|
|
key: test_fscore
|
|
value: [0.52631579 0.5 0.72727273 0.55555556 0.53333333 0.85714286
|
|
0.57142857 0.72727273 0.81818182 0.76190476]
|
|
|
|
mean value: 0.6578408141566037
|
|
|
|
key: train_fscore
|
|
value: [0.86486486 0.83333333 0.87830688 0.86772487 0.83798883 0.8603352
|
|
0.87567568 0.9010989 0.87150838 0.8700565 ]
|
|
|
|
mean value: 0.86608934204143
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.66666667 0.66666667 0.625 0.8 0.9
|
|
0.6 0.72727273 0.81818182 0.8 ]
|
|
|
|
mean value: 0.7159343434343435
|
|
|
|
key: train_precision
|
|
value: [0.88888889 0.88235294 0.88297872 0.87234043 0.89285714 0.90588235
|
|
0.89010989 0.93181818 0.91764706 0.92771084]
|
|
|
|
mean value: 0.8992586448924944
|
|
|
|
key: test_recall
|
|
value: [0.5 0.4 0.8 0.5 0.4 0.81818182
|
|
0.54545455 0.72727273 0.81818182 0.72727273]
|
|
|
|
mean value: 0.6236363636363637
|
|
|
|
key: train_recall
|
|
value: [0.84210526 0.78947368 0.87368421 0.86315789 0.78947368 0.81914894
|
|
0.86170213 0.87234043 0.82978723 0.81914894]
|
|
|
|
mean value: 0.8360022396416573
|
|
|
|
key: test_roc_auc
|
|
value: [0.56818182 0.60909091 0.71818182 0.61363636 0.65454545 0.85909091
|
|
0.57272727 0.71363636 0.80909091 0.76363636]
|
|
|
|
mean value: 0.6881818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [0.86786114 0.84154535 0.87833147 0.86774916 0.8468645 0.8674692
|
|
0.87821948 0.90459127 0.87805151 0.87799552]
|
|
|
|
mean value: 0.8708678611422173
|
|
|
|
key: test_jcc
|
|
value: [0.35714286 0.33333333 0.57142857 0.38461538 0.36363636 0.75
|
|
0.4 0.57142857 0.69230769 0.61538462]
|
|
|
|
mean value: 0.5039277389277389
|
|
|
|
key: train_jcc
|
|
value: [0.76190476 0.71428571 0.78301887 0.76635514 0.72115385 0.75490196
|
|
0.77884615 0.82 0.77227723 0.77 ]
|
|
|
|
mean value: 0.7642743672809006
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.34039497 1.35265207 1.41161752 1.35997891 1.55491352 1.25403571
|
|
1.12641239 1.19176412 0.80105114 0.87126184]
|
|
|
|
mean value: 1.2264082193374635
|
|
|
|
key: score_time
|
|
value: [0.01979089 0.02390909 0.03533006 0.0225184 0.02177119 0.02182436
|
|
0.01981211 0.01836801 0.01252675 0.01538301]
|
|
|
|
mean value: 0.021123385429382323
|
|
|
|
key: test_mcc
|
|
value: [0.23636364 0.61818182 0.63305416 0.52727273 0.80909091 0.67419986
|
|
0.42727273 0.53935989 0.71818182 0.67419986]
|
|
|
|
mean value: 0.5857177416208371
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.80952381 0.80952381 0.76190476 0.9047619 0.80952381
|
|
0.71428571 0.76190476 0.85714286 0.80952381]
|
|
|
|
mean value: 0.7857142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.8 0.81818182 0.76190476 0.9 0.77777778
|
|
0.72727273 0.8 0.85714286 0.77777778]
|
|
|
|
mean value: 0.782005772005772
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 0.8 0.75 0.72727273 0.9 1.
|
|
0.72727273 0.71428571 0.9 1. ]
|
|
|
|
mean value: 0.8118831168831169
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.8 0.9 0.8 0.9 0.63636364
|
|
0.72727273 0.90909091 0.81818182 0.63636364]
|
|
|
|
mean value: 0.7727272727272727
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61818182 0.80909091 0.81363636 0.76363636 0.90454545 0.81818182
|
|
0.71363636 0.75454545 0.85909091 0.81818182]
|
|
|
|
mean value: 0.7872727272727272
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.66666667 0.69230769 0.61538462 0.81818182 0.63636364
|
|
0.57142857 0.66666667 0.75 0.63636364]
|
|
|
|
mean value: 0.6481934731934732
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01981091 0.0168035 0.01414537 0.01470256 0.01423645 0.01411271
|
|
0.01426768 0.01455116 0.01446676 0.01585817]
|
|
|
|
mean value: 0.01529552936553955
|
|
|
|
key: score_time
|
|
value: [0.01231694 0.0095036 0.00918722 0.00877905 0.00922728 0.00901771
|
|
0.00886106 0.00959349 0.00907612 0.00917816]
|
|
|
|
mean value: 0.009474062919616699
|
|
|
|
key: test_mcc
|
|
value: [0.80909091 1. 0.55161872 0.53935989 1. 0.90909091
|
|
0.80909091 0.71562645 0.90909091 0.90829511]
|
|
|
|
mean value: 0.8151263804000601
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 1. 0.76190476 0.76190476 1. 0.95238095
|
|
0.9047619 0.85714286 0.95238095 0.95238095]
|
|
|
|
mean value: 0.9047619047619048
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.9 1. 0.7826087 0.70588235 1. 0.95238095
|
|
0.90909091 0.86956522 0.95238095 0.95652174]
|
|
|
|
mean value: 0.9028430818967903
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.9 1. 0.69230769 0.85714286 1. 1.
|
|
0.90909091 0.83333333 1. 0.91666667]
|
|
|
|
mean value: 0.9108541458541458
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 0.9 0.6 1. 0.90909091
|
|
0.90909091 0.90909091 0.90909091 1. ]
|
|
|
|
mean value: 0.9036363636363636
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90454545 1. 0.76818182 0.75454545 1. 0.95454545
|
|
0.90454545 0.85454545 0.95454545 0.95 ]
|
|
|
|
mean value: 0.9045454545454545
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.81818182 1. 0.64285714 0.54545455 1. 0.90909091
|
|
0.83333333 0.76923077 0.90909091 0.91666667]
|
|
|
|
mean value: 0.8343906093906094
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09894037 0.09839034 0.09680152 0.09823012 0.09769559 0.09740853
|
|
0.09786224 0.09949851 0.09909749 0.09787583]
|
|
|
|
mean value: 0.09818005561828613
|
|
|
|
key: score_time
|
|
value: [0.01770234 0.01870775 0.01915932 0.01832509 0.01789999 0.01840591
|
|
0.01811194 0.0182302 0.01793718 0.01793575]
|
|
|
|
mean value: 0.01824154853820801
|
|
|
|
key: test_mcc
|
|
value: [0.44038551 0.62641448 0.60302269 0.42727273 0.90829511 0.80909091
|
|
0.44038551 0.53935989 0.71818182 0.82572282]
|
|
|
|
mean value: 0.6338131459152349
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.80952381 0.76190476 0.71428571 0.95238095 0.9047619
|
|
0.71428571 0.76190476 0.85714286 0.9047619 ]
|
|
|
|
mean value: 0.8095238095238095
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.77777778 0.8 0.7 0.94736842 0.90909091
|
|
0.7 0.8 0.85714286 0.9 ]
|
|
|
|
mean value: 0.8118652692336903
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.875 0.66666667 0.7 1. 0.90909091
|
|
0.77777778 0.71428571 0.9 1. ]
|
|
|
|
mean value: 0.8209487734487735
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.7 1. 0.7 0.9 0.90909091
|
|
0.63636364 0.90909091 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8190909090909091
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71818182 0.80454545 0.77272727 0.71363636 0.95 0.90454545
|
|
0.71818182 0.75454545 0.85909091 0.90909091]
|
|
|
|
mean value: 0.8104545454545454
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.63636364 0.66666667 0.53846154 0.9 0.83333333
|
|
0.53846154 0.66666667 0.75 0.81818182]
|
|
|
|
mean value: 0.6919563769563769
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0100956 0.01037574 0.00979137 0.00928307 0.00931144 0.00919366
|
|
0.01059389 0.01200008 0.00972986 0.01077032]
|
|
|
|
mean value: 0.010114502906799317
|
|
|
|
key: score_time
|
|
value: [0.00959349 0.00903225 0.00895309 0.00887442 0.00888252 0.00879502
|
|
0.01142311 0.01174235 0.01009965 0.00888896]
|
|
|
|
mean value: 0.009628486633300782
|
|
|
|
key: test_mcc
|
|
value: [0.13762047 0.23373675 0.24771685 0.23636364 0.53935989 0.53935989
|
|
0.06741999 0.52295779 0.82572282 0.71818182]
|
|
|
|
mean value: 0.40684398983090986
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.61904762 0.61904762 0.61904762 0.76190476 0.76190476
|
|
0.52380952 0.76190476 0.9047619 0.85714286]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.52631579 0.55555556 0.63636364 0.6 0.70588235 0.8
|
|
0.44444444 0.7826087 0.9 0.85714286]
|
|
|
|
mean value: 0.6808313331573528
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.625 0.58333333 0.6 0.85714286 0.71428571
|
|
0.57142857 0.75 1. 0.9 ]
|
|
|
|
mean value: 0.7156746031746032
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.7 0.6 0.6 0.90909091
|
|
0.36363636 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.6627272727272727
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.56818182 0.61363636 0.62272727 0.61818182 0.75454545 0.75454545
|
|
0.53181818 0.75909091 0.90909091 0.85909091]
|
|
|
|
mean value: 0.6990909090909091
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.35714286 0.38461538 0.46666667 0.42857143 0.54545455 0.66666667
|
|
0.28571429 0.64285714 0.81818182 0.75 ]
|
|
|
|
mean value: 0.5345870795870796
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.11
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.31620884 1.31651926 1.31475592 1.30594683 1.28883219 1.36383224
|
|
1.30705667 1.3408339 1.34590888 1.31465459]
|
|
|
|
mean value: 1.321454930305481
|
|
|
|
key: score_time
|
|
value: [0.09206462 0.09813571 0.09290099 0.09021497 0.09371018 0.09852195
|
|
0.09075403 0.09710979 0.09322071 0.09802985]
|
|
|
|
mean value: 0.09446628093719482
|
|
|
|
key: test_mcc
|
|
value: [0.52295779 0.71562645 0.39196475 0.52295779 0.90909091 1.
|
|
0.71818182 0.71562645 1. 0.61818182]
|
|
|
|
mean value: 0.7114587764939949
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.85714286 0.66666667 0.76190476 0.95238095 1.
|
|
0.85714286 0.85714286 1. 0.80952381]
|
|
|
|
mean value: 0.8523809523809524
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.84210526 0.72 0.73684211 0.95238095 1.
|
|
0.85714286 0.86956522 1. 0.81818182]
|
|
|
|
mean value: 0.8533060318781143
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.88888889 0.6 0.77777778 0.90909091 1.
|
|
0.9 0.83333333 1. 0.81818182]
|
|
|
|
mean value: 0.8505050505050505
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.7 0.8 0.9 0.7 1. 1.
|
|
0.81818182 0.90909091 1. 0.81818182]
|
|
|
|
mean value: 0.8645454545454545
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75909091 0.85454545 0.67727273 0.75909091 0.95454545 1.
|
|
0.85909091 0.85454545 1. 0.80909091]
|
|
|
|
mean value: 0.8527272727272728
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
key: test_jcc
|
|
value: [0.58333333 0.72727273 0.5625 0.58333333 0.90909091 1.
|
|
0.75 0.76923077 1. 0.69230769]
|
|
|
|
mean value: 0.7577068764568765
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.85998034 1.44642282 1.04460621 0.97317791 0.90930486 0.89995694
|
|
0.87365675 0.88928413 0.94170761 0.95756435]
|
|
|
|
mean value: 0.9795661926269531
|
|
|
|
key: score_time
|
|
value: [0.2091279 0.2227211 0.17214561 0.14424253 0.16997743 0.16527343
|
|
0.17143083 0.20080829 0.17865419 0.12289047]
|
|
|
|
mean value: 0.17572717666625975
|
|
|
|
key: test_mcc
|
|
value: [0.52295779 0.53935989 0.39196475 0.62641448 0.80909091 0.82275335
|
|
0.52727273 0.61818182 1. 0.71818182]
|
|
|
|
mean value: 0.6576177533597088
|
|
|
|
key: train_mcc
|
|
value: [0.95788064 0.95788064 0.96830553 0.95788064 0.95788064 0.94757483
|
|
0.97905701 0.93736014 0.95789003 0.95789003]
|
|
|
|
mean value: 0.9579600124491399
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.76190476 0.66666667 0.80952381 0.9047619 0.9047619
|
|
0.76190476 0.80952381 1. 0.85714286]
|
|
|
|
mean value: 0.8238095238095238
|
|
|
|
key: train_accuracy
|
|
value: [0.97883598 0.97883598 0.98412698 0.97883598 0.97883598 0.97354497
|
|
0.98941799 0.96825397 0.97883598 0.97883598]
|
|
|
|
mean value: 0.9788359788359788
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.70588235 0.72 0.77777778 0.9 0.91666667
|
|
0.76190476 0.81818182 1. 0.85714286]
|
|
|
|
mean value: 0.8194398339878216
|
|
|
|
key: train_fscore
|
|
value: [0.97916667 0.97916667 0.98429319 0.97916667 0.97916667 0.97382199
|
|
0.98947368 0.96875 0.97894737 0.97894737]
|
|
|
|
mean value: 0.9790900270965371
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.85714286 0.6 0.875 0.9 0.84615385
|
|
0.8 0.81818182 1. 0.9 ]
|
|
|
|
mean value: 0.83742562992563
|
|
|
|
key: train_precision
|
|
value: [0.96907216 0.96907216 0.97916667 0.96907216 0.96907216 0.95876289
|
|
0.97916667 0.94897959 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9679864471561821
|
|
|
|
key: test_recall
|
|
value: [0.7 0.6 0.9 0.7 0.9 1.
|
|
0.72727273 0.81818182 1. 0.81818182]
|
|
|
|
mean value: 0.8163636363636364
|
|
|
|
key: train_recall
|
|
value: [0.98947368 0.98947368 0.98947368 0.98947368 0.98947368 0.9893617
|
|
1. 0.9893617 0.9893617 0.9893617 ]
|
|
|
|
mean value: 0.990481522956327
|
|
|
|
key: test_roc_auc
|
|
value: [0.75909091 0.75454545 0.67727273 0.80454545 0.90454545 0.9
|
|
0.76363636 0.80909091 1. 0.85909091]
|
|
|
|
mean value: 0.8231818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [0.9787794 0.9787794 0.98409854 0.9787794 0.9787794 0.97362822
|
|
0.98947368 0.96836506 0.97889138 0.97889138]
|
|
|
|
mean value: 0.9788465845464726
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.54545455 0.5625 0.63636364 0.81818182 0.84615385
|
|
0.61538462 0.69230769 1. 0.75 ]
|
|
|
|
mean value: 0.7049679487179488
|
|
|
|
key: train_jcc
|
|
value: [0.95918367 0.95918367 0.96907216 0.95918367 0.95918367 0.94897959
|
|
0.97916667 0.93939394 0.95876289 0.95876289]
|
|
|
|
mean value: 0.9590872829919221
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01177764 0.01067424 0.01179218 0.01120353 0.01094937 0.01100659
|
|
0.01069164 0.01066709 0.01039362 0.01051068]
|
|
|
|
mean value: 0.010966658592224121
|
|
|
|
key: score_time
|
|
value: [0.01029491 0.00953531 0.00973797 0.01007605 0.01004219 0.00984097
|
|
0.01000714 0.01029038 0.009624 0.01030111]
|
|
|
|
mean value: 0.009975004196166991
|
|
|
|
key: test_mcc
|
|
value: [0.23373675 0.42817442 0.14545455 0.45226702 0.42817442 0.55161872
|
|
0.06741999 0.23636364 0.55161872 0.60302269]
|
|
|
|
mean value: 0.36978509083764477
|
|
|
|
key: train_mcc
|
|
value: [0.55646909 0.49995455 0.56569532 0.45906255 0.53448943 0.46424351
|
|
0.54251375 0.55585218 0.49572783 0.4861571 ]
|
|
|
|
mean value: 0.5160165310554382
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.66666667 0.57142857 0.71428571 0.66666667 0.76190476
|
|
0.52380952 0.61904762 0.76190476 0.76190476]
|
|
|
|
mean value: 0.6666666666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.76719577 0.74074074 0.77248677 0.71957672 0.75661376 0.72486772
|
|
0.76190476 0.77248677 0.73544974 0.73544974]
|
|
|
|
mean value: 0.7486772486772486
|
|
|
|
key: test_fscore
|
|
value: [0.55555556 0.46153846 0.57142857 0.625 0.46153846 0.73684211
|
|
0.44444444 0.63636364 0.73684211 0.70588235]
|
|
|
|
mean value: 0.5935435694336623
|
|
|
|
key: train_fscore
|
|
value: [0.73170732 0.7030303 0.73939394 0.67484663 0.7195122 0.68292683
|
|
0.72392638 0.74556213 0.6835443 0.69512195]
|
|
|
|
mean value: 0.7099571975217122
|
|
|
|
key: test_precision
|
|
value: [0.625 1. 0.54545455 0.83333333 1. 0.875
|
|
0.57142857 0.63636364 0.875 1. ]
|
|
|
|
mean value: 0.7961580086580087
|
|
|
|
key: train_precision
|
|
value: [0.86956522 0.82857143 0.87142857 0.80882353 0.85507246 0.8
|
|
0.85507246 0.84 0.84375 0.81428571]
|
|
|
|
mean value: 0.8386569388625016
|
|
|
|
key: test_recall
|
|
value: [0.5 0.3 0.6 0.5 0.3 0.63636364
|
|
0.36363636 0.63636364 0.63636364 0.54545455]
|
|
|
|
mean value: 0.5018181818181818
|
|
|
|
key: train_recall
|
|
value: [0.63157895 0.61052632 0.64210526 0.57894737 0.62105263 0.59574468
|
|
0.62765957 0.67021277 0.57446809 0.60638298]
|
|
|
|
mean value: 0.6158678611422173
|
|
|
|
key: test_roc_auc
|
|
value: [0.61363636 0.65 0.57272727 0.70454545 0.65 0.76818182
|
|
0.53181818 0.61818182 0.76818182 0.77272727]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_roc_auc
|
|
value: [0.76791713 0.74143337 0.77318029 0.72032475 0.75733483 0.72418813
|
|
0.76119821 0.77194849 0.73460246 0.73477044]
|
|
|
|
mean value: 0.7486898096304592
|
|
|
|
key: test_jcc
|
|
value: [0.38461538 0.3 0.4 0.45454545 0.3 0.58333333
|
|
0.28571429 0.46666667 0.58333333 0.54545455]
|
|
|
|
mean value: 0.43036630036630036
|
|
|
|
key: train_jcc
|
|
value: [0.57692308 0.54205607 0.58653846 0.50925926 0.56190476 0.51851852
|
|
0.56730769 0.59433962 0.51923077 0.53271028]
|
|
|
|
mean value: 0.5508788517464236
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [1.0007422 1.08978605 0.60352254 0.19705272 0.85835528 0.39761066
|
|
0.57176566 0.29660773 0.19528532 0.50309563]
|
|
|
|
mean value: 0.5713823795318603
|
|
|
|
key: score_time
|
|
value: [0.013587 0.01366115 0.0136435 0.01247692 0.01475835 0.01236415
|
|
0.01414704 0.01221514 0.01440883 0.01542115]
|
|
|
|
mean value: 0.013668322563171386
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 1. 0.39196475 0.80909091 0.90909091 1.
|
|
0.80909091 0.90829511 1. 0.71818182]
|
|
|
|
mean value: 0.8368467750842328
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 1. 0.66666667 0.9047619 0.95238095 1.
|
|
0.9047619 0.95238095 1. 0.85714286]
|
|
|
|
mean value: 0.9142857142857143
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.72 0.9 0.95238095 1.
|
|
0.90909091 0.95652174 1. 0.85714286]
|
|
|
|
mean value: 0.9184025346634043
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.6 0.9 0.90909091 1.
|
|
0.90909091 0.91666667 1. 0.9 ]
|
|
|
|
mean value: 0.9134848484848485
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 1. 0.9 0.9 1. 1.
|
|
0.90909091 1. 1. 0.81818182]
|
|
|
|
mean value: 0.9327272727272727
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 1. 0.67727273 0.90454545 0.95454545 1.
|
|
0.90454545 0.95 1. 0.85909091]
|
|
|
|
mean value: 0.915
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.5625 0.81818182 0.90909091 1.
|
|
0.83333333 0.91666667 1. 0.75 ]
|
|
|
|
mean value: 0.8589772727272728
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03846598 0.02921033 0.0438509 0.10852194 0.03383636 0.02990055
|
|
0.03976679 0.04133964 0.06952095 0.03217459]
|
|
|
|
mean value: 0.046658802032470706
|
|
|
|
key: score_time
|
|
value: [0.04556179 0.01182222 0.03038454 0.01200247 0.01077509 0.01119876
|
|
0.01961231 0.01819873 0.01630306 0.01722431]
|
|
|
|
mean value: 0.01930832862854004
|
|
|
|
key: test_mcc
|
|
value: [0.43007562 0.52727273 0.33028913 0.71818182 0.74161985 0.71818182
|
|
0.33028913 0.80909091 0.82572282 0.63305416]
|
|
|
|
mean value: 0.6063777984708973
|
|
|
|
key: train_mcc
|
|
value: [0.96830553 0.96874655 0.97905237 0.96830907 0.96830907 0.95767077
|
|
0.97883539 0.96830907 0.94713854 0.96830553]
|
|
|
|
mean value: 0.9672981891345079
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.76190476 0.66666667 0.85714286 0.85714286 0.85714286
|
|
0.66666667 0.9047619 0.9047619 0.80952381]
|
|
|
|
mean value: 0.7999999999999999
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.98412698 0.98941799 0.98412698 0.98412698 0.97883598
|
|
0.98941799 0.98412698 0.97354497 0.98412698]
|
|
|
|
mean value: 0.9835978835978836
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.76190476 0.63157895 0.85714286 0.82352941 0.85714286
|
|
0.69565217 0.90909091 0.9 0.8 ]
|
|
|
|
mean value: 0.7902708584994222
|
|
|
|
key: train_fscore
|
|
value: [0.98429319 0.98395722 0.98958333 0.98412698 0.98412698 0.9787234
|
|
0.9893617 0.98412698 0.97326203 0.98395722]
|
|
|
|
mean value: 0.9835519056402777
|
|
|
|
key: test_precision
|
|
value: [0.75 0.72727273 0.66666667 0.81818182 1. 0.9
|
|
0.66666667 0.90909091 1. 0.88888889]
|
|
|
|
mean value: 0.8326767676767677
|
|
|
|
key: train_precision
|
|
value: [0.97916667 1. 0.97938144 0.9893617 0.9893617 0.9787234
|
|
0.9893617 0.97894737 0.97849462 0.98924731]
|
|
|
|
mean value: 0.9852045924508858
|
|
|
|
key: test_recall
|
|
value: [0.6 0.8 0.6 0.9 0.7 0.81818182
|
|
0.72727273 0.90909091 0.81818182 0.72727273]
|
|
|
|
mean value: 0.76
|
|
|
|
key: train_recall
|
|
value: [0.98947368 0.96842105 1. 0.97894737 0.97894737 0.9787234
|
|
0.9893617 0.9893617 0.96808511 0.9787234 ]
|
|
|
|
mean value: 0.9820044792833147
|
|
|
|
key: test_roc_auc
|
|
value: [0.70909091 0.76363636 0.66363636 0.85909091 0.85 0.85909091
|
|
0.66363636 0.90454545 0.90909091 0.81363636]
|
|
|
|
mean value: 0.7995454545454546
|
|
|
|
key: train_roc_auc
|
|
value: [0.98409854 0.98421053 0.9893617 0.98415454 0.98415454 0.97883539
|
|
0.98941769 0.98415454 0.97351624 0.98409854]
|
|
|
|
mean value: 0.9836002239641657
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.61538462 0.46153846 0.75 0.7 0.75
|
|
0.53333333 0.83333333 0.81818182 0.66666667]
|
|
|
|
mean value: 0.6628438228438228
|
|
|
|
key: train_jcc
|
|
value: [0.96907216 0.96842105 0.97938144 0.96875 0.96875 0.95833333
|
|
0.97894737 0.96875 0.94791667 0.96842105]
|
|
|
|
mean value: 0.9676743081931634
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01121068 0.01103139 0.01107359 0.01116538 0.0111959 0.0111444
0.01113462 0.0111568 0.01111794 0.01115346]

mean value: 0.011138415336608887

key: score_time
value: [0.0101912 0.01027894 0.01037192 0.01032901 0.01054072 0.01028752
0.01039696 0.01047206 0.01040912 0.01026344]

mean value: 0.010354089736938476

key: test_mcc
value: [0.33028913 0.05504819 0.39196475 0.43007562 0.44038551 0.63305416
0.05504819 0.24120908 0.52295779 0.23636364]

mean value: 0.3336396040864597

key: train_mcc
value: [0.43065616 0.50382186 0.43289183 0.41958895 0.45044462 0.43991059
0.42961362 0.49404873 0.47201413 0.46267525]

mean value: 0.45356657385003807

key: test_accuracy
value: [0.66666667 0.52380952 0.66666667 0.71428571 0.71428571 0.80952381
0.52380952 0.61904762 0.76190476 0.61904762]

mean value: 0.6619047619047619

key: train_accuracy
value: [0.71428571 0.75132275 0.71428571 0.70899471 0.72486772 0.71957672
0.71428571 0.74603175 0.73544974 0.73015873]

mean value: 0.7259259259259259

key: test_fscore
value: [0.63157895 0.54545455 0.72 0.66666667 0.72727273 0.8
0.5 0.69230769 0.7826087 0.63636364]

mean value: 0.6702252911085863

key: train_fscore
value: [0.73 0.76142132 0.73529412 0.72361809 0.73469388 0.7253886
0.72164948 0.75510204 0.74226804 0.74111675]

mean value: 0.7370552324342122

key: test_precision
value: [0.66666667 0.5 0.6 0.75 0.66666667 0.88888889
0.55555556 0.6 0.75 0.63636364]

mean value: 0.6614141414141415

key: train_precision
value: [0.6952381 0.73529412 0.68807339 0.69230769 0.71287129 0.70707071
0.7 0.7254902 0.72 0.70873786]

mean value: 0.7085083354043781

key: test_recall
value: [0.6 0.6 0.9 0.6 0.8 0.72727273
0.45454545 0.81818182 0.81818182 0.63636364]

mean value: 0.6954545454545454

key: train_recall
value: [0.76842105 0.78947368 0.78947368 0.75789474 0.75789474 0.74468085
0.74468085 0.78723404 0.76595745 0.77659574]

mean value: 0.7682306830907055

key: test_roc_auc
value: [0.66363636 0.52727273 0.67727273 0.70909091 0.71818182 0.81363636
0.52727273 0.60909091 0.75909091 0.61818182]

mean value: 0.6622727272727273

key: train_roc_auc
value: [0.71399776 0.75111982 0.71388578 0.7087346 0.72469205 0.71970885
0.71444569 0.7462486 0.7356103 0.73040314]

mean value: 0.7258846584546472

key: test_jcc
value: [0.46153846 0.375 0.5625 0.5 0.57142857 0.66666667
0.33333333 0.52941176 0.64285714 0.46666667]

mean value: 0.5109402607196725

key: train_jcc
value: [0.57480315 0.6147541 0.58139535 0.56692913 0.58064516 0.56910569
0.56451613 0.60655738 0.59016393 0.58870968]

mean value: 0.5837579700936688

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01644301 0.01346326 0.01460147 0.01373553 0.01718259 0.01639462
0.01637864 0.01489592 0.01578617 0.01493788]

mean value: 0.015381908416748047

key: score_time
value: [0.0110271 0.01079273 0.01092768 0.01038527 0.01035404 0.01038051
0.01045108 0.01053333 0.01052833 0.0103631 ]

mean value: 0.01057431697845459

key: test_mcc
value: [0.4719399 0.61818182 0.4719399 0.74161985 0.71562645 0.60302269
0.23373675 0.71562645 0.74161985 0.60302269]

mean value: 0.5916336343526927

key: train_mcc
value: [0.86125076 0.84693232 0.84923609 0.66188185 0.95789003 0.87061974
0.8157737 0.89595041 0.80682683 0.88957791]

mean value: 0.8455939633920262

key: test_accuracy
value: [0.71428571 0.80952381 0.71428571 0.85714286 0.85714286 0.76190476
0.61904762 0.85714286 0.85714286 0.76190476]

mean value: 0.780952380952381

key: train_accuracy
value: [0.92592593 0.92063492 0.92063492 0.8042328 0.97883598 0.93121693
0.8994709 0.94708995 0.89417989 0.94179894]

mean value: 0.9164021164021163

key: test_fscore
value: [0.75 0.8 0.75 0.82352941 0.84210526 0.70588235
0.66666667 0.86956522 0.88 0.70588235]

mean value: 0.7793631264862925

key: train_fscore
value: [0.93137255 0.92537313 0.92610837 0.75816993 0.9787234 0.92571429
0.90821256 0.94505495 0.90384615 0.93785311]

mean value: 0.9140428448974536

key: test_precision
value: [0.64285714 0.8 0.64285714 1. 0.88888889 1.
0.61538462 0.83333333 0.78571429 1. ]

mean value: 0.8209035409035409

key: train_precision
value: [0.87155963 0.87735849 0.87037037 1. 0.98924731 1.
0.83185841 0.97727273 0.8245614 1. ]

mean value: 0.9242228343653033

key: test_recall
value: [0.9 0.8 0.9 0.7 0.8 0.54545455
0.72727273 0.90909091 1. 0.54545455]

mean value: 0.7827272727272727

key: train_recall
value: [1. 0.97894737 0.98947368 0.61052632 0.96842105 0.86170213
1. 0.91489362 1. 0.88297872]

mean value: 0.9206942889137738

key: test_roc_auc
value: [0.72272727 0.80909091 0.72272727 0.85 0.85454545 0.77272727
0.61363636 0.85454545 0.85 0.77272727]

mean value: 0.7822727272727272

key: train_roc_auc
value: [0.92553191 0.92032475 0.92026876 0.80526316 0.97889138 0.93085106
0.9 0.94692049 0.89473684 0.94148936]

mean value: 0.916427771556551

key: test_jcc
value: [0.6 0.66666667 0.6 0.7 0.72727273 0.54545455
0.5 0.76923077 0.78571429 0.54545455]

mean value: 0.6439793539793539

key: train_jcc
value: [0.87155963 0.86111111 0.86238532 0.61052632 0.95833333 0.86170213
0.83185841 0.89583333 0.8245614 0.88297872]

mean value: 0.846084970934794

MCC on Blind test: 0.49

Accuracy on Blind test: 0.67

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.0150187 0.01364851 0.01349545 0.01297951 0.01252437 0.01285005
0.01290178 0.0133462 0.01365423 0.01284862]

mean value: 0.013326740264892578

key: score_time
value: [0.01063037 0.01029801 0.01048899 0.01053691 0.01031756 0.0104537
0.01043105 0.01037478 0.01037979 0.01044369]

mean value: 0.01043548583984375

key: test_mcc
value: [0.33709993 0.53935989 0.53935989 0.38924947 0.82572282 0.50874702
0.42727273 0.66332496 1. 0.52727273]

mean value: 0.5757409438784044

key: train_mcc
value: [0.80452249 0.84518345 0.79793785 0.47083798 0.80904214 0.61283493
0.85498064 0.88405964 0.88405964 0.89601922]

mean value: 0.7859477973314178

key: test_accuracy
value: [0.66666667 0.76190476 0.76190476 0.61904762 0.9047619 0.71428571
0.71428571 0.80952381 1. 0.76190476]

mean value: 0.7714285714285715

key: train_accuracy
value: [0.8994709 0.92063492 0.88888889 0.68253968 0.8994709 0.77248677
0.92592593 0.94179894 0.94179894 0.94708995]

mean value: 0.882010582010582

key: test_fscore
value: [0.58823529 0.70588235 0.70588235 0.71428571 0.90909091 0.78571429
0.72727273 0.84615385 1. 0.76190476]

mean value: 0.7744422244422244

key: train_fscore
value: [0.89385475 0.91712707 0.87573964 0.76 0.90731707 0.81385281
0.92857143 0.94240838 0.94240838 0.94845361]

mean value: 0.892973314316607

key: test_precision
value: [0.71428571 0.85714286 0.85714286 0.55555556 0.83333333 0.64705882
0.72727273 0.73333333 1. 0.8 ]

mean value: 0.7725125201595789

key: train_precision
value: [0.95238095 0.96511628 1. 0.61290323 0.84545455 0.68613139
0.89215686 0.92783505 0.92783505 0.92 ]

mean value: 0.8729813355410913

key: test_recall
value: [0.5 0.6 0.6 1. 1. 1.
0.72727273 1. 1. 0.72727273]

mean value: 0.8154545454545454

key: train_recall
value: [0.84210526 0.87368421 0.77894737 1. 0.97894737 1.
0.96808511 0.95744681 0.95744681 0.9787234 ]

mean value: 0.933538633818589

key: test_roc_auc
value: [0.65909091 0.75454545 0.75454545 0.63636364 0.90909091 0.7
0.71363636 0.8 1. 0.76363636]

mean value: 0.769090909090909

key: train_roc_auc
value: [0.89977604 0.92088466 0.88947368 0.68085106 0.89904815 0.77368421
0.92614782 0.9418813 0.9418813 0.94725644]

mean value: 0.8820884658454647

key: test_jcc
value: [0.41666667 0.54545455 0.54545455 0.55555556 0.83333333 0.64705882
0.57142857 0.73333333 1. 0.61538462]

mean value: 0.6463669990140578

key: train_jcc
value: [0.80808081 0.84693878 0.77894737 0.61290323 0.83035714 0.68613139
0.86666667 0.89108911 0.89108911 0.90196078]

mean value: 0.8114164376339148

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.14535904 0.14270997 0.1418457 0.14258361 0.14160824 0.14119768
0.14122701 0.15788913 0.1422658 0.14225483]

mean value: 0.14389410018920898

key: score_time
value: [0.01751041 0.0176034 0.01684165 0.01729679 0.01721644 0.01762438
0.01736569 0.01751566 0.01748872 0.01746202]

mean value: 0.017392516136169434

key: test_mcc
value: [0.71562645 0.90829511 0.26967994 0.71562645 0.90909091 0.80909091
0.71818182 0.71818182 1. 0.71562645]

mean value: 0.7479399847756402

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.85714286 0.95238095 0.61904762 0.85714286 0.95238095 0.9047619
0.85714286 0.85714286 1. 0.85714286]

mean value: 0.8714285714285714

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.84210526 0.94736842 0.66666667 0.84210526 0.95238095 0.90909091
0.85714286 0.85714286 1. 0.86956522]

mean value: 0.8743568407183968

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.88888889 1. 0.57142857 0.88888889 0.90909091 0.90909091
0.9 0.9 1. 0.83333333]

mean value: 0.88007215007215

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.8 0.9 0.8 0.8 1. 0.90909091
0.81818182 0.81818182 1. 0.90909091]

mean value: 0.8754545454545455

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.85454545 0.95 0.62727273 0.85454545 0.95454545 0.90454545
0.85909091 0.85909091 1. 0.85454545]

mean value: 0.8718181818181818

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.72727273 0.9 0.5 0.72727273 0.90909091 0.83333333
0.75 0.75 1. 0.76923077]

mean value: 0.7866200466200466

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.72

Accuracy on Blind test: 0.87

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.03950834 0.0361855 0.0394659 0.03514695 0.03398728 0.0323205
0.03702044 0.03738832 0.03468561 0.03310966]

mean value: 0.035881853103637694

key: score_time
value: [0.01812267 0.01860189 0.01861978 0.08073974 0.01802468 0.01793027
0.02523518 0.01894164 0.01871896 0.01882577]

mean value: 0.025376057624816893

key: test_mcc
value: [0.62641448 0.82275335 0.4719399 0.90829511 1. 0.90829511
0.80909091 0.90829511 0.82572282 0.90829511]

mean value: 0.8189101896090047

key: train_mcc
value: [0.97883539 0.97883539 0.98947368 0.98947368 1. 0.98947251
0.98947368 0.97905237 0.98947368 0.98947251]

mean value: 0.987356290106085

key: test_accuracy
value: [0.80952381 0.9047619 0.71428571 0.95238095 1. 0.95238095
0.9047619 0.95238095 0.9047619 0.95238095]

mean value: 0.9047619047619048

key: train_accuracy
value: [0.98941799 0.98941799 0.99470899 0.99470899 1. 0.99470899
0.99470899 0.98941799 0.99470899 0.99470899]

mean value: 0.9936507936507937

key: test_fscore
value: [0.77777778 0.88888889 0.75 0.94736842 1. 0.95652174
0.90909091 0.95652174 0.9 0.95652174]

mean value: 0.9042691214201511

key: train_fscore
value: [0.98947368 0.98947368 0.99470899 0.99470899 1. 0.99465241
0.99470899 0.98924731 0.99470899 0.99465241]

mean value: 0.9936335471919213

key: test_precision
value: [0.875 1. 0.64285714 1. 1. 0.91666667
0.90909091 0.91666667 1. 0.91666667]

mean value: 0.9176948051948052

key: train_precision
value: [0.98947368 0.98947368 1. 1. 1. 1.
0.98947368 1. 0.98947368 1. ]

mean value: 0.9957894736842106

key: test_recall
value: [0.7 0.8 0.9 0.9 1. 1.
0.90909091 1. 0.81818182 1. ]

mean value: 0.9027272727272727

key: train_recall
value: [0.98947368 0.98947368 0.98947368 0.98947368 1. 0.9893617
1. 0.9787234 1. 0.9893617 ]

mean value: 0.9915341545352744

key: test_roc_auc
value: [0.80454545 0.9 0.72272727 0.95 1. 0.95
0.90454545 0.95 0.90909091 0.95 ]

mean value: 0.9040909090909091

key: train_roc_auc
value: [0.98941769 0.98941769 0.99473684 0.99473684 1. 0.99468085
0.99473684 0.9893617 0.99473684 0.99468085]

mean value: 0.9936506159014558

key: test_jcc
value: [0.63636364 0.8 0.6 0.9 1. 0.91666667
0.83333333 0.91666667 0.81818182 0.91666667]

mean value: 0.8337878787878787

key: train_jcc
value: [0.97916667 0.97916667 0.98947368 0.98947368 1. 0.9893617
0.98947368 0.9787234 0.98947368 0.9893617 ]

mean value: 0.9873674878686076

MCC on Blind test: 0.87

Accuracy on Blind test: 0.93

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.08917499 0.13380027 0.17171621 0.11417508 0.11032367 0.10746312
0.12534523 0.12104058 0.1084094 0.1078198 ]

mean value: 0.11892683506011963

key: score_time
value: [0.03536677 0.02449703 0.02127337 0.01806617 0.02497578 0.02218723
0.02117968 0.02874351 0.02377129 0.03750396]

mean value: 0.025756478309631348

key: test_mcc
value: [0.03739788 0.53935989 0.52295779 0.23636364 0.62641448 0.33028913
0.18090681 0.52727273 0.39196475 0.55161872]

mean value: 0.3944545813288632

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.52380952 0.76190476 0.76190476 0.61904762 0.80952381 0.66666667
0.57142857 0.76190476 0.66666667 0.76190476]

mean value: 0.6904761904761905

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.44444444 0.70588235 0.73684211 0.6 0.77777778 0.69565217
0.47058824 0.76190476 0.58823529 0.73684211]

mean value: 0.6518169250919285

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.5 0.85714286 0.77777778 0.6 0.875 0.66666667
0.66666667 0.8 0.83333333 0.875 ]

mean value: 0.7451587301587301

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.4 0.6 0.7 0.6 0.7 0.72727273
0.36363636 0.72727273 0.45454545 0.63636364]

mean value: 0.5909090909090909

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.51818182 0.75454545 0.75909091 0.61818182 0.80454545 0.66363636
0.58181818 0.76363636 0.67727273 0.76818182]

mean value: 0.6909090909090909

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.28571429 0.54545455 0.58333333 0.42857143 0.63636364 0.53333333
0.30769231 0.61538462 0.41666667 0.58333333]

mean value: 0.4935847485847486

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.33

Accuracy on Blind test: 0.67

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.46492267 0.45418954 0.45839477 0.45019031 0.45531964 0.45568514
|
|
0.44885039 0.45692539 0.45565152 0.44814992]
|
|
|
|
mean value: 0.45482792854309084
|
|
|
|
key: score_time
|
|
value: [0.01082897 0.01071215 0.01069975 0.0107615 0.01069736 0.01068664
|
|
0.01067924 0.01079154 0.01067519 0.01112723]
|
|
|
|
mean value: 0.010765957832336425
|
|
|
|
key: test_mcc
|
|
value: [0.90829511 1. 0.4719399 0.90909091 0.90909091 0.90829511
|
|
0.80909091 0.90829511 1. 0.90829511]
|
|
|
|
mean value: 0.8732393055913986
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95238095 1. 0.71428571 0.95238095 0.95238095 0.95238095
|
|
0.9047619 0.95238095 1. 0.95238095]
|
|
|
|
mean value: 0.9333333333333333
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 1. 0.75 0.95238095 0.95238095 0.95652174
|
|
0.90909091 0.95652174 1. 0.95652174]
|
|
|
|
mean value: 0.938078645229675
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.64285714 0.90909091 0.90909091 0.91666667
|
|
0.90909091 0.91666667 1. 0.91666667]
|
|
|
|
mean value: 0.9120129870129869
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 0.9 1. 1. 1.
|
|
0.90909091 1. 1. 1. ]
|
|
|
|
mean value: 0.9709090909090909
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95 1. 0.72272727 0.95454545 0.95454545 0.95
|
|
0.90454545 0.95 1. 0.95 ]
|
|
|
|
mean value: 0.9336363636363636
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9 1. 0.6 0.90909091 0.90909091 0.91666667
|
|
0.83333333 0.91666667 1. 0.91666667]
|
|
|
|
mean value: 0.8901515151515151
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.07232475 0.05349302 0.05705237 0.06523085 0.04446149 0.04642963
|
|
0.0278461 0.02402425 0.02402234 0.0231905 ]
|
|
|
|
mean value: 0.04380753040313721
|
|
|
|
key: score_time
|
|
value: [0.02672768 0.02717161 0.02339268 0.01948118 0.02111459 0.0229249
|
|
0.01305509 0.01254392 0.01267433 0.01560426]
|
|
|
|
mean value: 0.019469022750854492
|
|
|
|
key: test_mcc
|
|
value: [0.55161872 0.74795759 0.74795759 0.55161872 0.82572282 0.90829511
|
|
0.74161985 0.66332496 0.90829511 0.80909091]
|
|
|
|
mean value: 0.7455501384398332
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.85714286 0.85714286 0.76190476 0.9047619 0.95238095
|
|
0.85714286 0.80952381 0.95238095 0.9047619 ]
|
|
|
|
mean value: 0.8619047619047618
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.86956522 0.86956522 0.7826087 0.90909091 0.95652174
|
|
0.88 0.84615385 0.95652174 0.90909091]
|
|
|
|
mean value: 0.876172696868349
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.69230769 0.76923077 0.76923077 0.69230769 0.83333333 0.91666667
|
|
0.78571429 0.73333333 0.91666667 0.90909091]
|
|
|
|
mean value: 0.8017882117882118
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 1. 1. 0.9 1. 1.
|
|
1. 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9709090909090909
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.76818182 0.86363636 0.86363636 0.76818182 0.90909091 0.95
|
|
0.85 0.8 0.95 0.90454545]
|
|
|
|
mean value: 0.8627272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.76923077 0.76923077 0.64285714 0.83333333 0.91666667
|
|
0.78571429 0.73333333 0.91666667 0.83333333]
|
|
|
|
mean value: 0.7843223443223444
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.047019 0.05411768 0.05120516 0.08179045 0.03637123 0.03757405
|
|
0.0525198 0.0249393 0.07093072 0.05749607]
|
|
|
|
mean value: 0.05139634609222412
|
|
|
|
key: score_time
|
|
value: [0.06786156 0.01952934 0.03211212 0.01617074 0.04544282 0.0289197
|
|
0.03419495 0.01245379 0.02793956 0.02250385]
|
|
|
|
mean value: 0.03071284294128418
|
|
|
|
key: test_mcc
|
|
value: [0.52295779 0.80909091 0.44038551 1. 0.80909091 0.82572282
|
|
0.55161872 0.82275335 1. 0.63305416]
|
|
|
|
mean value: 0.7414674176772243
|
|
|
|
key: train_mcc
|
|
value: [0.97883539 0.95789003 0.94713854 0.9264031 0.94714446 0.94757483
|
|
0.96830553 0.94757483 0.93650616 0.93736014]
|
|
|
|
mean value: 0.9494732992955751
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.9047619 0.71428571 1. 0.9047619 0.9047619
|
|
0.76190476 0.9047619 1. 0.80952381]
|
|
|
|
mean value: 0.8666666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.98941799 0.97883598 0.97354497 0.96296296 0.97354497 0.97354497
|
|
0.98412698 0.97354497 0.96825397 0.96825397]
|
|
|
|
mean value: 0.9746031746031746
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.9 0.72727273 1. 0.9 0.9
|
|
0.73684211 0.91666667 1. 0.8 ]
|
|
|
|
mean value: 0.861762360446571
|
|
|
|
key: train_fscore
|
|
value: [0.98947368 0.9787234 0.97382199 0.96256684 0.97354497 0.97382199
|
|
0.98395722 0.97382199 0.96808511 0.96875 ]
|
|
|
|
mean value: 0.9746567201151308
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.9 0.66666667 1. 0.9 1.
|
|
0.875 0.84615385 1. 0.88888889]
|
|
|
|
mean value: 0.8854487179487179
|
|
|
|
key: train_precision
|
|
value: [0.98947368 0.98924731 0.96875 0.97826087 0.9787234 0.95876289
|
|
0.98924731 0.95876289 0.96808511 0.94897959]
|
|
|
|
mean value: 0.9728293053102567
|
|
|
|
key: test_recall
|
|
value: [0.7 0.9 0.8 1. 0.9 0.81818182
|
|
0.63636364 1. 1. 0.72727273]
|
|
|
|
mean value: 0.8481818181818181
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:128: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:131: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)

key: train_recall
value: [0.98947368 0.96842105 0.97894737 0.94736842 0.96842105 0.9893617
0.9787234 0.9893617 0.96808511 0.9893617 ]
|
|
|
|
mean value: 0.9767525195968645
|
|
|
|
key: test_roc_auc
|
|
value: [0.75909091 0.90454545 0.71818182 1. 0.90454545 0.90909091
|
|
0.76818182 0.9 1. 0.81363636]
|
|
|
|
mean value: 0.8677272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [0.98941769 0.97889138 0.97351624 0.96304591 0.97357223 0.97362822
|
|
0.98409854 0.97362822 0.96825308 0.96836506]
|
|
|
|
mean value: 0.9746416573348264
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.81818182 0.57142857 1. 0.81818182 0.81818182
|
|
0.58333333 0.84615385 1. 0.66666667]
|
|
|
|
mean value: 0.7705461205461206
|
|
|
|
key: train_jcc
|
|
value: [0.97916667 0.95833333 0.94897959 0.92783505 0.94845361 0.94897959
|
|
0.96842105 0.94897959 0.93814433 0.93939394]
|
|
|
|
mean value: 0.9506686757226445
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.24432898 0.38443637 0.46159816 0.37851357 0.30649304 0.50042582
|
|
0.33015323 0.36040902 0.35314488 0.34496379]
|
|
|
|
mean value: 0.36644668579101564
|
|
|
|
key: score_time
|
|
value: [0.02193141 0.03222609 0.03092313 0.02275586 0.02286959 0.02013755
|
|
0.02346706 0.01971507 0.01932216 0.02431846]
|
|
|
|
mean value: 0.023766636848449707
|
|
|
|
key: test_mcc
|
|
value: [0.62641448 0.71562645 0.52295779 1. 0.80909091 0.52727273
|
|
0.42727273 0.82275335 1. 0.63305416]
|
|
|
|
mean value: 0.7084442598864285
|
|
|
|
key: train_mcc
|
|
value: [0.98947251 0.94714446 0.95767077 0.93650616 0.93650616 0.95789003
|
|
0.95789003 0.94757483 0.93650616 0.93736014]
|
|
|
|
mean value: 0.9504521239520874
|
|
|
|
key: test_accuracy
|
|
value: [0.80952381 0.85714286 0.76190476 1. 0.9047619 0.76190476
|
|
0.71428571 0.9047619 1. 0.80952381]
|
|
|
|
mean value: 0.8523809523809524
|
|
|
|
key: train_accuracy
|
|
value: [0.99470899 0.97354497 0.97883598 0.96825397 0.96825397 0.97883598
|
|
0.97883598 0.97354497 0.96825397 0.96825397]
|
|
|
|
mean value: 0.9751322751322751
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.84210526 0.73684211 1. 0.9 0.76190476
|
|
0.72727273 0.91666667 1. 0.8 ]
|
|
|
|
mean value: 0.8462569302042986
|
|
|
|
key: train_fscore
|
|
value: [0.9947644 0.97354497 0.97894737 0.96842105 0.96842105 0.97894737
|
|
0.97894737 0.97382199 0.96808511 0.96875 ]
|
|
|
|
mean value: 0.9752650677888823
|
|
|
|
key: test_precision
|
|
value: [0.875 0.88888889 0.77777778 1. 0.9 0.8
|
|
0.72727273 0.84615385 1. 0.88888889]
|
|
|
|
mean value: 0.8703982128982128
|
|
|
|
key: train_precision
|
|
value: [0.98958333 0.9787234 0.97894737 0.96842105 0.96842105 0.96875
|
|
0.96875 0.95876289 0.96808511 0.94897959]
|
|
|
|
mean value: 0.9697423796090515
|
|
|
|
key: test_recall
|
|
value: [0.7 0.8 0.7 1. 0.9 0.72727273
|
|
0.72727273 1. 1. 0.72727273]
|
|
|
|
mean value: 0.8281818181818181
|
|
|
|
key: train_recall
|
|
value: [1. 0.96842105 0.97894737 0.96842105 0.96842105 0.9893617
|
|
0.9893617 0.9893617 0.96808511 0.9893617 ]
|
|
|
|
mean value: 0.9809742441209407
|
|
|
|
key: test_roc_auc
|
|
value: [0.80454545 0.85454545 0.75909091 1. 0.90454545 0.76363636
|
|
0.71363636 0.9 1. 0.81363636]
|
|
|
|
mean value: 0.8513636363636363
|
|
|
|
key: train_roc_auc
|
|
value: [0.99468085 0.97357223 0.97883539 0.96825308 0.96825308 0.97889138
|
|
0.97889138 0.97362822 0.96825308 0.96836506]
|
|
|
|
mean value: 0.9751623740201567
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.72727273 0.58333333 1. 0.81818182 0.61538462
|
|
0.57142857 0.84615385 1. 0.66666667]
|
|
|
|
mean value: 0.7464785214785215
|
|
|
|
key: train_jcc
|
|
value: [0.98958333 0.94845361 0.95876289 0.93877551 0.93877551 0.95876289
|
|
0.95876289 0.94897959 0.93814433 0.93939394]
|
|
|
|
mean value: 0.9518394482910315
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03008127 0.03638411 0.03371334 0.03461933 0.088166 0.07241488
|
|
0.03378487 0.06745028 0.06802797 0.06680107]
|
|
|
|
mean value: 0.053144311904907225
|
|
|
|
key: score_time
|
|
value: [0.01519585 0.01250982 0.01437712 0.01443815 0.02278161 0.01534963
|
|
0.01221824 0.02406025 0.01457262 0.01535487]
|
|
|
|
mean value: 0.0160858154296875
|
|
|
|
key: test_mcc
|
|
value: [0.33028913 0.62641448 0.44038551 0.61818182 0.82275335 0.80909091
|
|
0.55161872 0.80909091 0.42727273 0.71818182]
|
|
|
|
mean value: 0.6153279376024733
|
|
|
|
key: train_mcc
|
|
value: [0.8738236 0.8314659 0.83068309 0.862486 0.8738236 0.87319373
|
|
0.90480458 0.88402082 0.86284197 0.89438907]
|
|
|
|
mean value: 0.8691532351249575
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.80952381 0.71428571 0.80952381 0.9047619 0.9047619
|
|
0.76190476 0.9047619 0.71428571 0.85714286]
|
|
|
|
mean value: 0.8047619047619048
|
|
|
|
key: train_accuracy
|
|
value: [0.93650794 0.91534392 0.91534392 0.93121693 0.93650794 0.93650794
|
|
0.95238095 0.94179894 0.93121693 0.94708995]
|
|
|
|
mean value: 0.9343915343915343
|
|
|
|
key: test_fscore
|
|
value: [0.63157895 0.77777778 0.72727273 0.8 0.88888889 0.90909091
|
|
0.73684211 0.90909091 0.72727273 0.85714286]
|
|
|
|
mean value: 0.7964957849168376
|
|
|
|
key: train_fscore
|
|
value: [0.93548387 0.91397849 0.91578947 0.93121693 0.93548387 0.93548387
|
|
0.95187166 0.94054054 0.92972973 0.94736842]
|
|
|
|
mean value: 0.9336946861504937
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.875 0.66666667 0.8 1. 0.90909091
|
|
0.875 0.90909091 0.72727273 0.9 ]
|
|
|
|
mean value: 0.8328787878787879
|
|
|
|
key: train_precision
|
|
value: [0.95604396 0.93406593 0.91578947 0.93617021 0.95604396 0.94565217
|
|
0.95698925 0.95604396 0.94505495 0.9375 ]
|
|
|
|
mean value: 0.9439353854927787
|
|
|
|
key: test_recall
|
|
value: [0.6 0.7 0.8 0.8 0.8 0.90909091
|
|
0.63636364 0.90909091 0.72727273 0.81818182]
|
|
|
|
mean value: 0.77
|
|
|
|
key: train_recall
|
|
value: [0.91578947 0.89473684 0.91578947 0.92631579 0.91578947 0.92553191
|
|
0.94680851 0.92553191 0.91489362 0.95744681]
|
|
|
|
mean value: 0.9238633818589026
|
|
|
|
key: test_roc_auc
|
|
value: [0.66363636 0.80454545 0.71818182 0.80909091 0.9 0.90454545
|
|
0.76818182 0.90454545 0.71363636 0.85909091]
|
|
|
|
mean value: 0.8045454545454546
|
|
|
|
key: train_roc_auc
|
|
value: [0.93661814 0.91545353 0.91534155 0.931243 0.93661814 0.93645017
|
|
0.95235162 0.94171333 0.93113102 0.94714446]
|
|
|
|
mean value: 0.9344064949608063
|
|
|
|
key: test_jcc
|
|
value: [0.46153846 0.63636364 0.57142857 0.66666667 0.8 0.83333333
|
|
0.58333333 0.83333333 0.57142857 0.75 ]
|
|
|
|
mean value: 0.6707425907425908
|
|
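The test_jcc values are consistent with the Jaccard index of the positive class, which for binary labels is tied to the F-score by J = F / (2 - F). A minimal sketch, assuming "jcc" denotes the Jaccard index and "fscore" the F1 score: it derives per-fold Jaccard values from the logged test_fscore values and compares them with the logged test_jcc values. The helper name `jaccard_from_f1` is hypothetical.

```python
# Per-fold values copied from the log above (8 decimals, as printed).
test_fscore = [0.63157895, 0.77777778, 0.72727273, 0.8, 0.88888889,
               0.90909091, 0.73684211, 0.90909091, 0.72727273, 0.85714286]
test_jcc = [0.46153846, 0.63636364, 0.57142857, 0.66666667, 0.8,
            0.83333333, 0.58333333, 0.83333333, 0.57142857, 0.75]


def jaccard_from_f1(f1: float) -> float:
    """Convert a binary F1 score to the Jaccard index: J = F / (2 - F)."""
    return f1 / (2.0 - f1)


# Jaccard values derived from the F-scores, fold by fold.
derived = [jaccard_from_f1(f) for f in test_fscore]

# Largest absolute difference between derived and logged values.
max_diff = max(abs(d - j) for d, j in zip(derived, test_jcc))
print(f"max |derived - logged| = {max_diff:.2e}")
```

Because the two metrics are monotonically related, they rank the folds identically; reporting both adds no independent information.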
|
|
key: train_jcc
|
|
value: [0.87878788 0.84158416 0.84466019 0.87128713 0.87878788 0.87878788
|
|
0.90816327 0.8877551 0.86868687 0.9 ]
|
|
|
|
mean value: 0.8758500353700914
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [1.30685544 1.22656345 2.2850945 1.44909978 0.99897575 0.93024874
0.82502961 1.04818726 1.14439416 0.94249988]

mean value: 1.2156948566436767

key: score_time
value: [0.05694318 0.05938172 0.03290272 0.0146718 0.01458669 0.01460767
0.01712728 0.01470828 0.01468945 0.01212597]

mean value: 0.025174474716186522

key: test_mcc
value: [0.61818182 0.71562645 0.52295779 1. 0.90829511 0.74795759
0.55161872 1. 0.71818182 0.67419986]

mean value: 0.7457019156935036

key: train_mcc
value: [1. 0.98947368 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9989473684210526

key: test_accuracy
value: [0.80952381 0.85714286 0.76190476 1. 0.95238095 0.85714286
0.76190476 1. 0.85714286 0.80952381]

mean value: 0.8666666666666667

key: train_accuracy
value: [1. 0.99470899 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9994708994708995

key: test_fscore
value: [0.8 0.84210526 0.73684211 1. 0.94736842 0.84210526
0.73684211 1. 0.85714286 0.77777778]

mean value: 0.8540183792815372

key: train_fscore
value: [1. 0.99470899 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9994708994708995

key: test_precision
value: [0.8 0.88888889 0.77777778 1. 1. 1.
0.875 1. 0.9 1. ]

mean value: 0.9241666666666667

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.8 0.8 0.7 1. 0.9 0.72727273
0.63636364 1. 0.81818182 0.63636364]

mean value: 0.8018181818181819

key: train_recall
value: [1. 0.98947368 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9989473684210526

key: test_roc_auc
value: [0.80909091 0.85454545 0.75909091 1. 0.95 0.86363636
0.76818182 1. 0.85909091 0.81818182]

mean value: 0.8681818181818182

key: train_roc_auc
value: [1. 0.99473684 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9994736842105263

key: test_jcc
value: [0.66666667 0.72727273 0.58333333 1. 0.9 0.72727273
0.58333333 1. 0.75 0.63636364]

mean value: 0.7574242424242424

key: train_jcc
value: [1. 0.98947368 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9989473684210526

MCC on Blind test: 0.6

Accuracy on Blind test: 0.8

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01341915 0.00960445 0.009269 0.00893736 0.00891161 0.0090003
0.00897193 0.00950503 0.00953388 0.01000404]

mean value: 0.009715676307678223

key: score_time
value: [0.01628661 0.01169658 0.00924897 0.00879931 0.00866485 0.00861454
0.00882459 0.00885248 0.00950265 0.0094347 ]

mean value: 0.009992527961730956

key: test_mcc
value: [ 0.35527986 -0.23373675 0.11677484 0.23373675 0.39196475 0.50874702
0.42727273 0.33709993 0.15569979 0.80909091]

mean value: 0.3101929821275265

key: train_mcc
value: [0.4223863 0.42057994 0.42563559 0.42871542 0.43824416 0.39871188
0.49053012 0.44175632 0.4436004 0.39053852]

mean value: 0.4300698660601598

key: test_accuracy
value: [0.66666667 0.38095238 0.52380952 0.61904762 0.66666667 0.71428571
0.71428571 0.66666667 0.57142857 0.9047619 ]

mean value: 0.6428571428571428

key: train_accuracy
value: [0.6984127 0.69312169 0.7037037 0.71428571 0.70899471 0.68783069
0.73544974 0.7037037 0.69312169 0.67724868]

mean value: 0.7015873015873015

key: test_fscore
value: [0.69565217 0.43478261 0.64285714 0.55555556 0.72 0.78571429
0.72727273 0.72 0.68965517 0.90909091]

mean value: 0.688058057551311

key: train_fscore
value: [0.74439462 0.74561404 0.74311927 0.71276596 0.74885845 0.73059361
0.76635514 0.75 0.75213675 0.73127753]

mean value: 0.7425115357581491

key: test_precision
value: [0.61538462 0.38461538 0.5 0.625 0.6 0.64705882
0.72727273 0.64285714 0.55555556 0.90909091]

mean value: 0.6206835158305747

key: train_precision
value: [0.6484375 0.63909774 0.65853659 0.72043011 0.66129032 0.64
0.68333333 0.64615385 0.62857143 0.62406015]

mean value: 0.654991101826883

key: test_recall
value: [0.8 0.5 0.9 0.5 0.9 1.
0.72727273 0.81818182 0.90909091 0.90909091]

mean value: 0.7963636363636364

key: train_recall
value: [0.87368421 0.89473684 0.85263158 0.70526316 0.86315789 0.85106383
0.87234043 0.89361702 0.93617021 0.88297872]

mean value: 0.8625643896976484

key: test_roc_auc
value: [0.67272727 0.38636364 0.54090909 0.61363636 0.67727273 0.7
0.71363636 0.65909091 0.55454545 0.90454545]

mean value: 0.6422727272727272

key: train_roc_auc
value: [0.6974804 0.69204927 0.70291153 0.71433371 0.70817469 0.68868981
0.73617021 0.70470325 0.6944009 0.67833147]

mean value: 0.7017245240761478

key: test_jcc
value: [0.53333333 0.27777778 0.47368421 0.38461538 0.5625 0.64705882
0.57142857 0.5625 0.52631579 0.83333333]

mean value: 0.5372547224017812

key: train_jcc
value: [0.59285714 0.59440559 0.59124088 0.55371901 0.59854015 0.57553957
0.62121212 0.6 0.60273973 0.57638889]

mean value: 0.5906643071898742

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00971127 0.00953364 0.00909424 0.00915623 0.00902915 0.00920463
0.00920725 0.00904489 0.00909853 0.00923586]

mean value: 0.0092315673828125

key: score_time
value: [0.00908184 0.00872087 0.00866318 0.00864768 0.0086987 0.00893164
0.00869703 0.00877166 0.00874424 0.00874138]

mean value: 0.008769822120666505

key: test_mcc
value: [ 0.24771685 -0.08528029 0.26967994 0.43007562 0.45226702 0.24771685
0.15894099 0.33028913 0.13762047 0.30914104]

mean value: 0.24981676122162266

key: train_mcc
value: [0.47383838 0.43945337 0.49511046 0.41111248 0.4606251 0.46213311
0.45044462 0.43913092 0.44972004 0.47093091]

mean value: 0.45524993919694523

key: test_accuracy
value: [0.61904762 0.47619048 0.61904762 0.71428571 0.71428571 0.61904762
0.57142857 0.66666667 0.57142857 0.61904762]

mean value: 0.6190476190476191

key: train_accuracy
value: [0.73544974 0.71957672 0.74603175 0.7037037 0.73015873 0.73015873
0.72486772 0.71957672 0.72486772 0.73544974]

mean value: 0.726984126984127

key: test_fscore
value: [0.63636364 0.26666667 0.66666667 0.66666667 0.625 0.6
0.52631579 0.69565217 0.60869565 0.5 ]

mean value: 0.5792027251924277

key: train_fscore
value: [0.72222222 0.71657754 0.73333333 0.68539326 0.72727273 0.7150838
0.71428571 0.71657754 0.72340426 0.7311828 ]

mean value: 0.7185333185655622

key: test_precision
value: [0.58333333 0.4 0.57142857 0.75 0.83333333 0.66666667
0.625 0.66666667 0.58333333 0.8 ]

mean value: 0.6479761904761905

key: train_precision
value: [0.76470588 0.72826087 0.77647059 0.73493976 0.73913043 0.75294118
0.73863636 0.72043011 0.72340426 0.73913043]

mean value: 0.7418049871707797

key: test_recall
value: [0.7 0.2 0.8 0.6 0.5 0.54545455
0.45454545 0.72727273 0.63636364 0.36363636]

mean value: 0.5527272727272727

key: train_recall
value: [0.68421053 0.70526316 0.69473684 0.64210526 0.71578947 0.68085106
0.69148936 0.71276596 0.72340426 0.72340426]

mean value: 0.6974020156774916

key: test_roc_auc
value: [0.62272727 0.46363636 0.62727273 0.70909091 0.70454545 0.62272727
0.57727273 0.66363636 0.56818182 0.63181818]

mean value: 0.6190909090909091

key: train_roc_auc
value: [0.73572228 0.71965286 0.74630459 0.70403135 0.73023516 0.72989922
0.72469205 0.71954087 0.72486002 0.73538634]

mean value: 0.7270324748040313

key: test_jcc
value: [0.46666667 0.15384615 0.5 0.5 0.45454545 0.42857143
0.35714286 0.53333333 0.4375 0.33333333]

mean value: 0.4164939227439227

key: train_jcc
value: [0.56521739 0.55833333 0.57894737 0.52136752 0.57142857 0.55652174
0.55555556 0.55833333 0.56666667 0.57627119]

mean value: 0.5608642666981495

MCC on Blind test: 0.44

Accuracy on Blind test: 0.73

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00931072 0.00956655 0.00860357 0.00860739 0.00866055 0.00952816
0.00955248 0.00887895 0.00881219 0.00958371]

mean value: 0.009110426902770996

key: score_time
value: [0.01484823 0.0155077 0.01463413 0.014539 0.01421666 0.01497364
0.01540303 0.01495123 0.01449418 0.01488686]

mean value: 0.014845466613769532

key: test_mcc
value: [-0.23636364 0.24120908 -0.14545455 0.42727273 -0.15894099 0.04545455
0.08528029 0.13762047 0.15894099 0.08528029]

mean value: 0.06402992102965852

key: train_mcc
value: [0.50296855 0.39710991 0.42871542 0.39710991 0.3454297 0.51484568
0.49237699 0.43913092 0.44975918 0.39915366]

mean value: 0.43665999514803383

key: test_accuracy
value: [0.38095238 0.61904762 0.42857143 0.71428571 0.42857143 0.52380952
0.52380952 0.57142857 0.57142857 0.52380952]

mean value: 0.5285714285714286

key: train_accuracy
value: [0.75132275 0.6984127 0.71428571 0.6984127 0.67195767 0.75661376
0.74603175 0.71957672 0.72486772 0.6984127 ]

mean value: 0.717989417989418

key: test_fscore
value: [0.38095238 0.5 0.4 0.7 0.33333333 0.54545455
0.375 0.60869565 0.52631579 0.375 ]

mean value: 0.4744751701387857

key: train_fscore
value: [0.7486631 0.69518717 0.71276596 0.69518717 0.65934066 0.74444444
0.73913043 0.71657754 0.72043011 0.6779661 ]

mean value: 0.710969267849835

key: test_precision
value: [0.36363636 0.66666667 0.4 0.7 0.375 0.54545455
0.6 0.58333333 0.625 0.6 ]

mean value: 0.5459090909090909

key: train_precision
value: [0.76086957 0.70652174 0.72043011 0.70652174 0.68965517 0.77906977
0.75555556 0.72043011 0.72826087 0.72289157]

mean value: 0.7290206189773512

key: test_recall
value: [0.4 0.4 0.4 0.7 0.3 0.54545455
0.27272727 0.63636364 0.45454545 0.27272727]

mean value: 0.4381818181818182

key: train_recall
value: [0.73684211 0.68421053 0.70526316 0.68421053 0.63157895 0.71276596
0.72340426 0.71276596 0.71276596 0.63829787]

mean value: 0.6942105263157895

key: test_roc_auc
value: [0.38181818 0.60909091 0.42727273 0.71363636 0.42272727 0.52272727
0.53636364 0.56818182 0.57727273 0.53636364]

mean value: 0.5295454545454545

key: train_roc_auc
value: [0.75139978 0.69848824 0.71433371 0.69848824 0.67217245 0.75638298
0.74591265 0.71954087 0.72480403 0.6980963 ]

mean value: 0.7179619260918253

key: test_jcc
value: [0.23529412 0.33333333 0.25 0.53846154 0.2 0.375
0.23076923 0.4375 0.35714286 0.23076923]

mean value: 0.31882703081232494

key: train_jcc
value: [0.5982906 0.53278689 0.55371901 0.53278689 0.49180328 0.59292035
0.5862069 0.55833333 0.56302521 0.51282051]

mean value: 0.5522692962507294

MCC on Blind test: 0.17

Accuracy on Blind test: 0.6

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.01352286 0.01665878 0.01201367 0.01270628 0.01311684 0.0131042
0.01292396 0.01178217 0.01296258 0.01138806]

mean value: 0.013017940521240234

key: score_time
value: [0.011729 0.00963879 0.01002765 0.01018095 0.01015759 0.01019979
0.01010227 0.00988412 0.01014733 0.00937819]

mean value: 0.010144567489624024

key: test_mcc
value: [0.26967994 0.03739788 0.30914104 0.61818182 0.33636364 0.42727273
0.24771685 0.52295779 0.33709993 0.82572282]

mean value: 0.3931534435784296

key: train_mcc
value: [0.73576888 0.75666293 0.74744848 0.75694773 0.73585755 0.75666293
0.80967855 0.82013664 0.80967855 0.8102023 ]

mean value: 0.7739044547190053

key: test_accuracy
value: [0.61904762 0.52380952 0.61904762 0.80952381 0.66666667 0.71428571
0.61904762 0.76190476 0.66666667 0.9047619 ]

mean value: 0.6904761904761905

key: train_accuracy
value: [0.86772487 0.87830688 0.87301587 0.87830688 0.86772487 0.87830688
0.9047619 0.91005291 0.9047619 0.9047619 ]

mean value: 0.8867724867724868

key: test_fscore
value: [0.66666667 0.44444444 0.69230769 0.8 0.66666667 0.72727273
0.6 0.7826087 0.72 0.9 ]

mean value: 0.6999966893010371

key: train_fscore
value: [0.87046632 0.87830688 0.87755102 0.88082902 0.86631016 0.87830688
0.90322581 0.90909091 0.90322581 0.90217391]

mean value: 0.8869486709274905

key: test_precision
value: [0.57142857 0.5 0.5625 0.8 0.63636364 0.72727273
0.66666667 0.75 0.64285714 1. ]

mean value: 0.6857088744588744

key: train_precision
value: [0.85714286 0.88297872 0.85148515 0.86734694 0.88043478 0.87368421
0.91304348 0.91397849 0.91304348 0.92222222]

mean value: 0.8875360334340103

key: test_recall
value: [0.8 0.4 0.9 0.8 0.7 0.72727273
0.54545455 0.81818182 0.81818182 0.81818182]

mean value: 0.7327272727272728

key: train_recall
value: [0.88421053 0.87368421 0.90526316 0.89473684 0.85263158 0.88297872
0.89361702 0.90425532 0.89361702 0.88297872]

mean value: 0.8867973124300113

key: test_roc_auc
value: [0.62727273 0.51818182 0.63181818 0.80909091 0.66818182 0.71363636
0.62272727 0.75909091 0.65909091 0.90909091]

mean value: 0.6918181818181819
|
|
|
|
key: train_roc_auc
|
|
value: [0.86763718 0.87833147 0.87284434 0.87821948 0.86780515 0.87833147
|
|
0.90470325 0.9100224 0.90470325 0.90464726]
|
|
|
|
mean value: 0.8867245240761478
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.28571429 0.52941176 0.66666667 0.5 0.57142857
|
|
0.42857143 0.64285714 0.5625 0.81818182]
|
|
|
|
mean value: 0.5505331678125795
|
|
|
|
key: train_jcc
|
|
value: [0.7706422 0.78301887 0.78181818 0.78703704 0.76415094 0.78301887
|
|
0.82352941 0.83333333 0.82352941 0.82178218]
|
|
|
|
mean value: 0.7971860435015932
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.80780959 1.64683247 2.04292989 2.00777054 2.20077896 2.11120558
|
|
1.75354791 1.55821252 1.74798012 1.62602258]
|
|
|
|
mean value: 1.7503090143203734
|
|
|
|
key: score_time
|
|
value: [0.01250052 0.03408003 0.01242089 0.0147202 0.02710557 0.03695774
|
|
0.02797961 0.02157736 0.03396821 0.04308009]
|
|
|
|
mean value: 0.0264390230178833
|
|
|
|
key: test_mcc
|
|
value: [0.23636364 0.62641448 0.63305416 0.71562645 0.71562645 0.60302269
|
|
0.4719399 0.90909091 0.52727273 0.82572282]
|
|
|
|
mean value: 0.6264134232369432
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.80952381 0.80952381 0.85714286 0.85714286 0.76190476
|
|
0.71428571 0.95238095 0.76190476 0.9047619 ]
|
|
|
|
mean value: 0.8047619047619048
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.77777778 0.81818182 0.84210526 0.84210526 0.70588235
|
|
0.66666667 0.95238095 0.76190476 0.9 ]
|
|
|
|
mean value: 0.7867004856168943
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 0.875 0.75 0.88888889 0.88888889 1.
|
|
0.85714286 1. 0.8 1. ]
|
|
|
|
mean value: 0.8659920634920635
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.7 0.9 0.8 0.8 0.54545455
|
|
0.54545455 0.90909091 0.72727273 0.81818182]
|
|
|
|
mean value: 0.7345454545454545
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61818182 0.80454545 0.81363636 0.85454545 0.85454545 0.77272727
|
|
0.72272727 0.95454545 0.76363636 0.90909091]
|
|
|
|
mean value: 0.8068181818181819
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.63636364 0.69230769 0.72727273 0.72727273 0.54545455
|
|
0.5 0.90909091 0.61538462 0.81818182]
|
|
|
|
mean value: 0.65999000999001
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03607321 0.01854372 0.01873636 0.01854372 0.01878786 0.01843524
|
|
0.0188539 0.0193491 0.01886439 0.01850581]
|
|
|
|
mean value: 0.02046933174133301
|
|
|
|
key: score_time
|
|
value: [0.0126884 0.0127058 0.01279783 0.0127821 0.01274991 0.01255703
|
|
0.01291847 0.01300454 0.01293087 0.01302671]
|
|
|
|
mean value: 0.012816166877746582
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.82275335 0.71818182 1. 1. 0.90909091
|
|
0.52727273 1. 0.82572282 0.90909091]
|
|
|
|
mean value: 0.8534865889896018
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.9047619 0.85714286 1. 1. 0.95238095
|
|
0.76190476 1. 0.9047619 0.95238095]
|
|
|
|
mean value: 0.9238095238095237
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 0.85714286 1. 1. 0.95238095
|
|
0.76190476 1. 0.9 0.95238095]
|
|
|
|
mean value: 0.9201587301587302
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.81818182 1. 1. 1.
|
|
0.8 1. 1. 1. ]
|
|
|
|
mean value: 0.9618181818181818
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.8 0.9 1. 1. 0.90909091
|
|
0.72727273 1. 0.81818182 0.90909091]
|
|
|
|
mean value: 0.8863636363636364
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.9 0.85909091 1. 1. 0.95454545
|
|
0.76363636 1. 0.90909091 0.95454545]
|
|
|
|
mean value: 0.9240909090909091
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 0.75 1. 1. 0.90909091
|
|
0.61538462 1. 0.81818182 0.90909091]
|
|
|
|
mean value: 0.8601748251748251
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.49
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14116931 0.14187193 0.14279318 0.14259577 0.14345264 0.14271784
|
|
0.14100552 0.1444006 0.28073263 0.1378026 ]
|
|
|
|
mean value: 0.1558542013168335
|
|
|
|
key: score_time
|
|
value: [0.02478743 0.02508497 0.02502012 0.02513218 0.02525902 0.02499199
|
|
0.02501416 0.02670145 0.0441525 0.02399564]
|
|
|
|
mean value: 0.027013945579528808
|
|
|
|
key: test_mcc
|
|
value: [0.33636364 0.33028913 0.63305416 0.71562645 1. 0.90909091
|
|
0.55161872 0.90909091 0.90829511 0.90909091]
|
|
|
|
mean value: 0.7202519935788301
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.66666667 0.80952381 0.85714286 1. 0.95238095
|
|
0.76190476 0.95238095 0.95238095 0.95238095]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.63157895 0.81818182 0.84210526 1. 0.95238095
|
|
0.73684211 0.95238095 0.95652174 0.95238095]
|
|
|
|
mean value: 0.850903939691125
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.66666667 0.75 0.88888889 1. 1.
|
|
0.875 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.8733585858585858
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.7 0.6 0.9 0.8 1. 0.90909091
|
|
0.63636364 0.90909091 1. 0.90909091]
|
|
|
|
mean value: 0.8363636363636363
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.66818182 0.66363636 0.81363636 0.85454545 1. 0.95454545
|
|
0.76818182 0.95454545 0.95 0.95454545]
|
|
|
|
mean value: 0.8581818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.46153846 0.69230769 0.72727273 1. 0.90909091
|
|
0.58333333 0.90909091 0.91666667 0.90909091]
|
|
|
|
mean value: 0.7608391608391608
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01380777 0.01287603 0.01335287 0.01323295 0.01333666 0.0131979
|
|
0.01330137 0.01331997 0.01724935 0.01322055]
|
|
|
|
mean value: 0.013689541816711425
|
|
|
|
key: score_time
|
|
value: [0.01226544 0.01245737 0.01221156 0.01228142 0.01906157 0.01224637
|
|
0.01229739 0.01220536 0.02101064 0.01225805]
|
|
|
|
mean value: 0.013829517364501952
|
|
|
|
key: test_mcc
|
|
value: [0.23636364 0.33636364 0.33028913 0.62641448 0.62641448 0.71818182
|
|
0.35527986 0.82572282 0.63305416 0.52727273]
|
|
|
|
mean value: 0.52153567593267
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.66666667 0.66666667 0.80952381 0.80952381 0.85714286
|
|
0.66666667 0.9047619 0.80952381 0.76190476]
|
|
|
|
mean value: 0.7571428571428571
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.66666667 0.63157895 0.77777778 0.77777778 0.85714286
|
|
0.63157895 0.9 0.8 0.76190476]
|
|
|
|
mean value: 0.7404427736006683
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 0.63636364 0.66666667 0.875 0.875 0.9
|
|
0.75 1. 0.88888889 0.8 ]
|
|
|
|
mean value: 0.7991919191919192
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.7 0.6 0.7 0.7 0.81818182
|
|
0.54545455 0.81818182 0.72727273 0.72727273]
|
|
|
|
mean value: 0.6936363636363636
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61818182 0.66818182 0.66363636 0.80454545 0.80454545 0.85909091
|
|
0.67272727 0.90909091 0.81363636 0.76363636]
|
|
|
|
mean value: 0.7577272727272728
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.5 0.46153846 0.63636364 0.63636364 0.75
|
|
0.46153846 0.81818182 0.66666667 0.61538462]
|
|
|
|
mean value: 0.5974608724608724
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.89492655 1.78066468 1.30577731 1.62279463 1.80648351 2.03026009
|
|
1.99937129 1.9573288 1.32665277 1.25374627]
|
|
|
|
mean value: 1.697800588607788
|
|
|
|
key: score_time
|
|
value: [0.13693285 0.12314796 0.09956264 0.12987232 0.12343669 0.12388182
|
|
0.15492225 0.12438631 0.09179854 0.09004569]
|
|
|
|
mean value: 0.11979870796203614
|
|
|
|
key: test_mcc
|
|
value: [0.23636364 0.58630197 0.63305416 0.80909091 1. 1.
|
|
0.55161872 0.90909091 0.90909091 0.90909091]
|
|
|
|
mean value: 0.7543702131757848
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.76190476 0.80952381 0.9047619 1. 1.
|
|
0.76190476 0.95238095 0.95238095 0.95238095]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.66666667 0.81818182 0.9 1. 1.
|
|
0.73684211 0.95238095 0.95238095 0.95238095]
|
|
|
|
mean value: 0.8578833447254499
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 1. 0.75 0.9 1. 1. 0.875 1. 1. 1. ]
|
|
|
|
mean value: 0.9125
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.5 0.9 0.9 1. 1.
|
|
0.63636364 0.90909091 0.90909091 0.90909091]
|
|
|
|
mean value: 0.8263636363636364
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61818182 0.75 0.81363636 0.90454545 1. 1.
|
|
0.76818182 0.95454545 0.95454545 0.95454545]
|
|
|
|
mean value: 0.8718181818181818
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
[0.42857143 0.5 0.69230769 0.81818182 1. 1.
|
|
0.58333333 0.90909091 0.90909091 0.90909091]
|
|
|
|
mean value: 0.7749666999667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.88374734 0.898525 0.88748789 0.90992236 0.87181115 0.90699387
|
|
0.91898417 0.8755424 0.88821411 0.94180608]
|
|
|
|
mean value: 0.8983034372329712
|
|
|
|
key: score_time
|
|
value: [0.13070011 0.15602064 0.2069819 0.17687368 0.21706963 0.24573994
|
|
0.12457681 0.16810298 0.19927859 0.16002679]
|
|
|
|
mean value: 0.17853710651397706
|
|
|
|
key: test_mcc
|
|
value: [0.52295779 0.24120908 0.4719399 0.62641448 0.90829511 0.90829511
|
|
0.63305416 0.90909091 0.71562645 0.82572282]
|
|
|
|
mean value: 0.6762605808801134
|
|
|
|
key: train_mcc
|
|
value: [0.95767077 0.95767077 0.95788064 0.95767077 0.96830553 0.95789003
|
|
0.98947368 0.96830907 0.96830907 0.96830907]
|
|
|
|
mean value: 0.9651489409652466
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.61904762 0.71428571 0.80952381 0.95238095 0.95238095
|
|
0.80952381 0.95238095 0.85714286 0.9047619 ]
|
|
|
|
mean value: 0.8333333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.97883598 0.97883598 0.97883598 0.97883598 0.98412698 0.97883598
|
|
0.99470899 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9825396825396825
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.5 0.75 0.77777778 0.94736842 0.95652174
|
|
0.8 0.95238095 0.86956522 0.9 ]
|
|
|
|
mean value: 0.8190456212996259
|
|
|
|
key: train_fscore
|
|
value: [0.97894737 0.97894737 0.97916667 0.97894737 0.98429319 0.97894737
|
|
0.99470899 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9826339281158102
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.66666667 0.64285714 0.875 1. 0.91666667
|
|
0.88888889 1. 0.83333333 1. ]
|
|
|
|
mean value: 0.8601190476190477
|
|
|
|
key: train_precision
|
|
value: [0.97894737 0.97894737 0.96907216 0.97894737 0.97916667 0.96875
|
|
0.98947368 0.97894737 0.97894737 0.97894737]
|
|
|
|
mean value: 0.9780146726351963
|
|
|
|
key: test_recall
|
|
value: [0.7 0.4 0.9 0.7 0.9 1.
|
|
0.72727273 0.90909091 0.90909091 0.81818182]
|
|
|
|
mean value: 0.7963636363636364
|
|
|
|
key: train_recall
|
|
value: [0.97894737 0.97894737 0.98947368 0.97894737 0.98947368 0.9893617
|
|
1. 0.9893617 0.9893617 0.9893617 ]
|
|
|
|
mean value: 0.9873236282194849
|
|
|
|
key: test_roc_auc
|
|
value: [0.75909091 0.60909091 0.72272727 0.80454545 0.95 0.95
|
|
0.81363636 0.95454545 0.85454545 0.90909091]
|
|
|
|
mean value: 0.8327272727272728
|
|
|
|
key: train_roc_auc
|
|
value: [0.97883539 0.97883539 0.9787794 0.97883539 0.98409854 0.97889138
|
|
0.99473684 0.98415454 0.98415454 0.98415454]
|
|
|
|
mean value: 0.9825475923852184
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.33333333 0.6 0.63636364 0.9 0.91666667
|
|
0.66666667 0.90909091 0.76923077 0.81818182]
|
|
|
|
mean value: 0.7132867132867133
|
|
|
|
key: train_jcc
|
|
value: [0.95876289 0.95876289 0.95918367 0.95876289 0.96907216 0.95876289
|
|
0.98947368 0.96875 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9659031069020121
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02410841 0.00991416 0.01022744 0.01032209 0.00959754 0.01036572
|
|
0.00976348 0.00933981 0.01035428 0.01001096]
|
|
|
|
mean value: 0.011400389671325683
|
|
|
|
key: score_time
|
|
value: [0.00971651 0.00905752 0.00934982 0.00952864 0.00984573 0.00897074
|
|
0.00895166 0.00975013 0.00975347 0.00914025]
|
|
|
|
mean value: 0.009406447410583496
|
|
|
|
key: test_mcc
|
|
value: [ 0.24771685 -0.08528029 0.26967994 0.43007562 0.45226702 0.24771685
|
|
0.15894099 0.33028913 0.13762047 0.30914104]
|
|
|
|
mean value: 0.24981676122162266
|
|
|
|
key: train_mcc
|
|
value: [0.47383838 0.43945337 0.49511046 0.41111248 0.4606251 0.46213311
|
|
0.45044462 0.43913092 0.44972004 0.47093091]
|
|
|
|
mean value: 0.45524993919694523
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.47619048 0.61904762 0.71428571 0.71428571 0.61904762
|
|
0.57142857 0.66666667 0.57142857 0.61904762]
|
|
|
|
mean value: 0.6190476190476191
|
|
|
|
key: train_accuracy
|
|
value: [0.73544974 0.71957672 0.74603175 0.7037037 0.73015873 0.73015873
|
|
0.72486772 0.71957672 0.72486772 0.73544974]
|
|
|
|
mean value: 0.726984126984127
|
|
|
|
key: test_fscore
|
|
value: [0.63636364 0.26666667 0.66666667 0.66666667 0.625 0.6
|
|
0.52631579 0.69565217 0.60869565 0.5 ]
|
|
|
|
mean value: 0.5792027251924277
|
|
|
|
key: train_fscore
|
|
value: [0.72222222 0.71657754 0.73333333 0.68539326 0.72727273 0.7150838
|
|
0.71428571 0.71657754 0.72340426 0.7311828 ]
|
|
|
|
mean value: 0.7185333185655622
|
|
|
|
key: test_precision
|
|
value: [0.58333333 0.4 0.57142857 0.75 0.83333333 0.66666667
|
|
0.625 0.66666667 0.58333333 0.8 ]
|
|
|
|
mean value: 0.6479761904761905
|
|
|
|
key: train_precision
|
|
value: [0.76470588 0.72826087 0.77647059 0.73493976 0.73913043 0.75294118
|
|
0.73863636 0.72043011 0.72340426 0.73913043]
|
|
|
|
mean value: 0.7418049871707797
|
|
|
|
key: test_recall
|
|
value: [0.7 0.2 0.8 0.6 0.5 0.54545455
|
|
0.45454545 0.72727273 0.63636364 0.36363636]
|
|
|
|
mean value: 0.5527272727272727
|
|
|
|
key: train_recall
|
|
value: [0.68421053 0.70526316 0.69473684 0.64210526 0.71578947 0.68085106
|
|
0.69148936 0.71276596 0.72340426 0.72340426]
|
|
|
|
mean value: 0.6974020156774916
|
|
|
|
key: test_roc_auc
|
|
value: [0.62272727 0.46363636 0.62727273 0.70909091 0.70454545 0.62272727
|
|
0.57727273 0.66363636 0.56818182 0.63181818]
|
|
|
|
mean value: 0.6190909090909091
|
|
|
|
key: train_roc_auc
|
|
value: [0.73572228 0.71965286 0.74630459 0.70403135 0.73023516 0.72989922
|
|
0.72469205 0.71954087 0.72486002 0.73538634]
|
|
|
|
mean value: 0.7270324748040313
|
|
|
|
key: test_jcc
|
|
value: [0.46666667 0.15384615 0.5 0.5 0.45454545 0.42857143
|
|
0.35714286 0.53333333 0.4375 0.33333333]
|
|
|
|
mean value: 0.4164939227439227
|
|
|
|
key: train_jcc
|
|
value: [0.56521739 0.55833333 0.57894737 0.52136752 0.57142857 0.55652174
|
|
0.55555556 0.55833333 0.56666667 0.57627119]
|
|
|
|
mean value: 0.5608642666981495
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [1.47359419 1.51395917 1.40394354 1.49806833 0.73248506 0.14855814
|
|
0.12012458 1.29104447 0.43357348 0.96943855]
|
|
|
|
mean value: 0.9584789514541626
|
|
|
|
key: score_time
|
|
value: [0.01303411 0.01369548 0.01240373 0.02015805 0.01262212 0.01179838
|
|
0.01309061 0.01261091 0.01320601 0.01363063]
|
|
|
|
mean value: 0.013625001907348633
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.90829511 0.63305416 1. 1. 1.
|
|
0.71562645 1. 1. 0.82572282]
|
|
|
|
mean value: 0.8905451893561251
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.95238095 0.80952381 1. 1. 1.
|
|
0.85714286 1. 1. 0.9047619 ]
|
|
|
|
mean value: 0.9428571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.94736842 0.81818182 1. 1. 1.
|
|
0.86956522 1. 1. 0.9 ]
|
|
|
|
mean value: 0.9424004345514643
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.75 1. 1. 1.
|
|
0.83333333 1. 1. 1. ]
|
|
|
|
mean value: 0.9583333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.9 0.9 1. 1. 1.
|
|
0.90909091 1. 1. 0.81818182]
|
|
|
|
mean value: 0.9327272727272727
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.95 0.81363636 1. 1. 1.
|
|
0.85454545 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9427272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.9 0.69230769 1. 1. 1.
|
|
0.76923077 1. 1. 0.81818182]
|
|
|
|
mean value: 0.897972027972028
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04121399 0.07725215 0.07548904 0.03828049 0.06962919 0.06137753
|
|
0.04697084 0.05984068 0.04247427 0.06800699]
|
|
|
|
mean value: 0.05805351734161377
|
|
|
|
key: score_time
|
|
value: [0.01933718 0.0345602 0.01221371 0.01211905 0.0219357 0.01210523
|
|
0.02289724 0.01555729 0.02384901 0.02143526]
|
|
|
|
mean value: 0.019600987434387207
|
|
|
|
key: test_mcc
|
|
value: [0.53935989 0.44038551 0.53935989 0.80909091 0.74161985 0.82572282
|
|
0.67419986 0.90909091 0.82572282 0.53300179]
|
|
|
|
mean value: 0.6837554253924925
|
|
|
|
key: train_mcc
|
|
value: [0.97883539 0.94757483 0.98947251 0.97905701 0.97905701 0.96830553
|
|
0.95788064 0.95788064 0.92637852 0.95788064]
|
|
|
|
mean value: 0.9642322713040267
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.71428571 0.76190476 0.9047619 0.85714286 0.9047619
|
|
0.80952381 0.95238095 0.9047619 0.71428571]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.98941799 0.97354497 0.99470899 0.98941799 0.98941799 0.98412698
|
|
0.97883598 0.97883598 0.96296296 0.97883598]
|
|
|
|
mean value: 0.982010582010582
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.72727273 0.70588235 0.9 0.82352941 0.9
|
|
0.77777778 0.95238095 0.9 0.625 ]
|
|
|
|
mean value: 0.8017725575078516
|
|
|
|
key: train_fscore
|
|
value: [0.98947368 0.97326203 0.9947644 0.9893617 0.9893617 0.98395722
|
|
0.97849462 0.97849462 0.96216216 0.97849462]
|
|
|
|
mean value: 0.9817826770838407
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.66666667 0.85714286 0.9 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9280952380952381
|
|
|
|
key: train_precision
|
|
value: [0.98947368 0.98913043 0.98958333 1. 1. 0.98924731
|
|
0.98913043 0.98913043 0.97802198 0.98913043]
|
|
|
|
mean value: 0.990284804652423
|
|
|
|
key: test_recall
|
|
value: [0.6 0.8 0.6 0.9 0.7 0.81818182
|
|
0.63636364 0.90909091 0.81818182 0.45454545]
|
|
|
|
mean value: 0.7236363636363636
|
|
|
|
key: train_recall
|
|
value: [0.98947368 0.95789474 1. 0.97894737 0.97894737 0.9787234
|
|
0.96808511 0.96808511 0.94680851 0.96808511]
|
|
|
|
mean value: 0.973505039193729
|
|
|
|
key: test_roc_auc
|
|
value: [0.75454545 0.71818182 0.75454545 0.90454545 0.85 0.90909091
|
|
0.81818182 0.95454545 0.90909091 0.72727273]
|
|
|
|
mean value: 0.83
|
|
|
|
key: train_roc_auc
|
|
value: [0.98941769 0.97362822 0.99468085 0.98947368 0.98947368 0.98409854
|
|
0.9787794 0.9787794 0.96287794 0.9787794 ]
|
|
|
|
mean value: 0.9819988801791713
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.57142857 0.54545455 0.81818182 0.7 0.81818182
|
|
0.63636364 0.90909091 0.81818182 0.45454545]
|
|
|
|
mean value: 0.6816883116883117
|
|
|
|
key: train_jcc
|
|
value: [0.97916667 0.94791667 0.98958333 0.97894737 0.97894737 0.96842105
|
|
0.95789474 0.95789474 0.92708333 0.95789474]
|
|
|
|
mean value: 0.964375
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.53
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02042866 0.01330423 0.01539516 0.01460958 0.01472521 0.0147078
|
|
0.01172614 0.01205611 0.00887656 0.00874543]
|
|
|
|
mean value: 0.013457489013671876
|
|
|
|
key: score_time
|
|
value: [0.02030158 0.01348376 0.01388836 0.01446819 0.01431298 0.01398635
|
|
0.01170921 0.00890064 0.00833917 0.00834632]
|
|
|
|
mean value: 0.012773656845092773
|
|
|
|
key: test_mcc
|
|
value: [ 0.33028913 -0.05504819 0.08528029 0.42727273 0.55161872 0.62641448
|
|
0.23636364 0.24120908 0.02312486 0.55161872]
|
|
|
|
mean value: 0.30181434631411985
|
|
|
|
key: train_mcc
|
|
value: [0.39005594 0.40281841 0.35974476 0.37994444 0.43065616 0.40240809
|
|
0.44248737 0.47825095 0.44248737 0.39243141]
|
|
|
|
mean value: 0.41212849003171664
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.47619048 0.52380952 0.71428571 0.76190476 0.80952381
 0.61904762 0.61904762 0.52380952 0.76190476]

mean value: 0.6476190476190476

key: train_accuracy
value: [0.69312169 0.6984127 0.67724868 0.68783069 0.71428571 0.6984127
 0.71957672 0.73544974 0.71957672 0.69312169]

mean value: 0.7037037037037037

key: test_fscore
value: [0.63157895 0.42105263 0.61538462 0.7 0.7826087 0.83333333
 0.63636364 0.69230769 0.64285714 0.73684211]

mean value: 0.669232880010912

key: train_fscore
value: [0.71568627 0.72463768 0.70531401 0.71219512 0.73 0.71921182
 0.73366834 0.75490196 0.73366834 0.71568627]

mean value: 0.7244969828653581

key: test_precision
value: [0.66666667 0.44444444 0.5 0.7 0.69230769 0.76923077
 0.63636364 0.6 0.52941176 0.875 ]

mean value: 0.6413424973719091

key: train_precision
value: [0.66972477 0.66964286 0.65178571 0.66363636 0.6952381 0.66972477
 0.6952381 0.7 0.6952381 0.66363636]

mean value: 0.6773865125699988

key: test_recall
value: [0.6 0.4 0.8 0.7 0.9 0.90909091
 0.63636364 0.81818182 0.81818182 0.63636364]

mean value: 0.7218181818181818

key: train_recall
value: [0.76842105 0.78947368 0.76842105 0.76842105 0.76842105 0.77659574
 0.77659574 0.81914894 0.77659574 0.77659574]

mean value: 0.7788689809630459

key: test_roc_auc
value: [0.66363636 0.47272727 0.53636364 0.71363636 0.76818182 0.80454545
 0.61818182 0.60909091 0.50909091 0.76818182]

mean value: 0.6463636363636364

key: train_roc_auc
value: [0.69272116 0.69792833 0.67676372 0.68740202 0.71399776 0.69882419
 0.71987682 0.73589026 0.71987682 0.69356103]

mean value: 0.7036842105263158

key: test_jcc
value: [0.46153846 0.26666667 0.44444444 0.53846154 0.64285714 0.71428571
 0.46666667 0.52941176 0.47368421 0.58333333]

mean value: 0.5121349943486166

key: train_jcc
value: [0.55725191 0.56818182 0.54477612 0.5530303 0.57480315 0.56153846
 0.57936508 0.60629921 0.57936508 0.55725191]

mean value: 0.5681863039882344

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01249909 0.01643276 0.01513815 0.01626801 0.01667428 0.01826024
 0.01673746 0.01813006 0.0181365 0.01722217]

mean value: 0.01654987335205078

key: score_time
value: [0.00913095 0.01083755 0.01164222 0.01167583 0.01165843 0.01161766
 0.01240993 0.01225019 0.01175737 0.01168919]

mean value: 0.01146693229675293

key: test_mcc
value: [0.30914104 0.45226702 0.71562645 0.67419986 0.58630197 0.60302269
 0.55161872 0.82275335 0.74795759 0.67419986]

mean value: 0.6137088554293553

key: train_mcc
value: [0.80642655 0.76291765 0.83076702 0.58655527 0.76291765 0.82445214
 0.94713854 0.93736014 0.91553719 0.93837953]

mean value: 0.8312451676284541

key: test_accuracy
value: [0.61904762 0.71428571 0.85714286 0.80952381 0.76190476 0.76190476
 0.76190476 0.9047619 0.85714286 0.80952381]

mean value: 0.7857142857142857

key: train_accuracy
value: [0.89417989 0.86772487 0.91005291 0.75661376 0.86772487 0.9047619
 0.97354497 0.96825397 0.95767196 0.96825397]

mean value: 0.9068783068783068

key: test_fscore
value: [0.69230769 0.625 0.84210526 0.83333333 0.66666667 0.70588235
 0.73684211 0.91666667 0.84210526 0.77777778]

mean value: 0.7638687121272261

key: train_fscore
value: [0.9047619 0.84848485 0.90285714 0.80508475 0.84848485 0.89411765
 0.97326203 0.96875 0.95698925 0.96703297]

mean value: 0.9069825383840636

key: test_precision
value: [0.5625 0.83333333 0.88888889 0.71428571 1. 1.
 0.875 0.84615385 1. 1. ]

mean value: 0.8720161782661783

key: train_precision
value: [0.82608696 1. 0.9875 0.67375887 1. 1.
 0.97849462 0.94897959 0.9673913 1. ]

mean value: 0.9382211341610441

key: test_recall
value: [0.9 0.5 0.8 1. 0.5 0.54545455
 0.63636364 1. 0.72727273 0.63636364]

mean value: 0.7245454545454546

key: train_recall
value: [1. 0.73684211 0.83157895 1. 0.73684211 0.80851064
 0.96808511 0.9893617 0.94680851 0.93617021]

mean value: 0.8954199328107503

key: test_roc_auc
value: [0.63181818 0.70454545 0.85454545 0.81818182 0.75 0.77272727
 0.76818182 0.9 0.86363636 0.81818182]

mean value: 0.7881818181818182

key: train_roc_auc
value: [0.89361702 0.86842105 0.91047032 0.75531915 0.86842105 0.90425532
 0.97351624 0.96836506 0.95761478 0.96808511]

mean value: 0.9068085106382979

key: test_jcc
value: [0.52941176 0.45454545 0.72727273 0.71428571 0.5 0.54545455
 0.58333333 0.84615385 0.72727273 0.63636364]

mean value: 0.6264093749387867

key: train_jcc
value: [0.82608696 0.73684211 0.82291667 0.67375887 0.73684211 0.80851064
 0.94791667 0.93939394 0.91752577 0.93617021]

mean value: 0.834596392928326

MCC on Blind test: 0.49

Accuracy on Blind test: 0.73

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01460838 0.01574397 0.01721883 0.01524639 0.03197598 0.01554775
 0.01560473 0.01410699 0.01534986 0.01558042]

mean value: 0.017098331451416017

key: score_time
value: [0.01211929 0.01337838 0.01466918 0.01597857 0.02657938 0.01167512
 0.01169777 0.0117774 0.01175189 0.01166606]

mean value: 0.014129304885864257

key: test_mcc
value: [0.26967994 0.53300179 0.62641448 0.58630197 0.90829511 0.38924947
 0.55161872 0.50874702 0.80909091 0.62641448]

mean value: 0.580881390303866

key: train_mcc
value: [0.82785245 0.48764459 0.69501809 0.53983361 0.93841972 0.41041408
 0.87061974 0.48948681 0.79048128 0.63728115]

mean value: 0.668705151901458

key: test_accuracy
value: [0.61904762 0.71428571 0.80952381 0.76190476 0.95238095 0.61904762
 0.76190476 0.71428571 0.9047619 0.80952381]

mean value: 0.7666666666666666

key: train_accuracy
value: [0.91005291 0.69312169 0.82539683 0.72486772 0.96825397 0.64550265
 0.93121693 0.6984127 0.88888889 0.78835979]

mean value: 0.8074074074074074

key: test_fscore
value: [0.66666667 0.76923077 0.77777778 0.66666667 0.94736842 0.42857143
 0.73684211 0.78571429 0.90909091 0.83333333]

mean value: 0.7521262363367627

key: train_fscore
value: [0.91625616 0.76612903 0.78980892 0.62318841 0.9673913 0.44628099
 0.92571429 0.7654321 0.87719298 0.8245614 ]

mean value: 0.790195557941608

key: test_precision
value: [0.57142857 0.625 0.875 1. 1. 1.
 0.875 0.64705882 0.90909091 0.76923077]

mean value: 0.8271809073279661

key: train_precision
value: [0.86111111 0.62091503 1. 1. 1. 1.
 1. 0.62416107 0.97402597 0.70149254]

mean value: 0.878170572895576

key: test_recall
value: [0.8 1. 0.7 0.5 0.9 0.27272727
 0.63636364 1. 0.90909091 0.90909091]

mean value: 0.7627272727272727

key: train_recall
value: [0.97894737 1. 0.65263158 0.45263158 0.93684211 0.28723404
 0.86170213 0.9893617 0.79787234 1. ]

mean value: 0.7957222844344904

key: test_roc_auc
value: [0.62727273 0.72727273 0.80454545 0.75 0.95 0.63636364
 0.76818182 0.7 0.90454545 0.80454545]

mean value: 0.7672727272727273

key: train_roc_auc
value: [0.90968645 0.69148936 0.82631579 0.72631579 0.96842105 0.64361702
 0.93085106 0.69994401 0.88840985 0.78947368]

mean value: 0.8074524076147817

key: test_jcc
value: [0.5 0.625 0.63636364 0.5 0.9 0.27272727
 0.58333333 0.64705882 0.83333333 0.71428571]

mean value: 0.6212102113572702

key: train_jcc
value: [0.84545455 0.62091503 0.65263158 0.45263158 0.93684211 0.28723404
 0.86170213 0.62 0.78125 0.70149254]

mean value: 0.6760153548818377

MCC on Blind test: 0.49

Accuracy on Blind test: 0.73

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.12339163 0.15838337 0.13479781 0.10761476 0.11238742 0.11786103
 0.11912537 0.12101555 0.1190908 0.16042662]

mean value: 0.12740943431854249

key: score_time
value: [0.01511359 0.02331877 0.01539397 0.01508164 0.01590371 0.01642942
 0.01661968 0.01648045 0.01624656 0.02382159]

mean value: 0.017440938949584962

key: test_mcc
value: [0.80909091 0.82275335 0.52727273 0.90829511 1. 0.90909091
 0.71818182 0.82572282 1. 0.90909091]

mean value: 0.8429498554008733

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9047619 0.9047619 0.76190476 0.95238095 1. 0.95238095
 0.85714286 0.9047619 1. 0.95238095]

mean value: 0.919047619047619

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.9 0.88888889 0.76190476 0.94736842 1. 0.95238095
 0.85714286 0.9 1. 0.95238095]

mean value: 0.9160066833751045

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.9 1. 0.72727273 1. 1. 1.
 0.9 1. 1. 1. ]

mean value: 0.9527272727272728

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.9 0.8 0.8 0.9 1. 0.90909091
 0.81818182 0.81818182 1. 0.90909091]

mean value: 0.8854545454545455

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.90454545 0.9 0.76363636 0.95 1. 0.95454545
 0.85909091 0.90909091 1. 0.95454545]

mean value: 0.9195454545454546

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.81818182 0.8 0.61538462 0.9 1. 0.90909091
 0.75 0.81818182 1. 0.90909091]

mean value: 0.8519930069930071

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.87

Accuracy on Blind test: 0.93

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.03891611 0.03142166 0.04908037 0.03901887 0.03456759 0.02894402
 0.03260779 0.03758121 0.04022551 0.04435349]

mean value: 0.037671661376953124

key: score_time
value: [0.01717496 0.01689577 0.02591538 0.02410626 0.01824427 0.02229643
 0.02497411 0.02035999 0.02795792 0.01740479]

mean value: 0.021532988548278807

key: test_mcc
value: [0.74161985 0.90829511 0.71818182 0.82275335 0.90829511 1.
 0.80909091 1. 1. 0.90909091]

mean value: 0.881732704873914

key: train_mcc
value: [0.97905701 0.97883539 0.98947368 0.98947368 1. 0.98947251
 1. 0.97905237 1. 0.98947251]

mean value: 0.9894837157705416

key: test_accuracy
value: [0.85714286 0.95238095 0.85714286 0.9047619 0.95238095 1.
 0.9047619 1. 1. 0.95238095]

mean value: 0.9380952380952381

key: train_accuracy
value: [0.98941799 0.98941799 0.99470899 0.99470899 1. 0.99470899
 1. 0.98941799 1. 0.99470899]

mean value: 0.9947089947089947

key: test_fscore
value: [0.82352941 0.94736842 0.85714286 0.88888889 0.94736842 1.
 0.90909091 1. 1. 0.95238095]

mean value: 0.9325769861373576

key: train_fscore
value: [0.9893617 0.98947368 0.99470899 0.99470899 1. 0.99465241
 1. 0.98924731 1. 0.99465241]

mean value: 0.9946805500418356

key: test_precision
value: [1. 1. 0.81818182 1. 1. 1.
 0.90909091 1. 1. 1. ]

mean value: 0.9727272727272728

key: train_precision
value: [1. 0.98947368 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9989473684210526

key: test_recall
value: [0.7 0.9 0.9 0.8 0.9 1.
 0.90909091 1. 1. 0.90909091]

mean value: 0.9018181818181819

key: train_recall
value: [0.97894737 0.98947368 0.98947368 0.98947368 1. 0.9893617
 1. 0.9787234 1. 0.9893617 ]

mean value: 0.990481522956327

key: test_roc_auc
value: [0.85 0.95 0.85909091 0.9 0.95 1.
 0.90454545 1. 1. 0.95454545]

mean value: 0.9368181818181818

key: train_roc_auc
value: [0.98947368 0.98941769 0.99473684 0.99473684 1. 0.99468085
 1. 0.9893617 1. 0.99468085]

mean value: 0.9947088465845465

key: test_jcc
value: [0.7 0.9 0.75 0.8 0.9 1.
 0.83333333 1. 1. 0.90909091]

mean value: 0.8792424242424243

key: train_jcc
value: [0.97894737 0.97916667 0.98947368 0.98947368 1. 0.9893617
 1. 0.9787234 1. 0.9893617 ]

mean value: 0.989450821201941

MCC on Blind test: 0.72

Accuracy on Blind test: 0.87

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.06760859 0.46913624 0.26768494 0.08426118 0.09855819 0.09934545
 0.07997847 0.09252858 0.13018012 0.09178209]

mean value: 0.14810638427734374

key: score_time
value: [0.02587819 0.01851845 0.02169013 0.02464533 0.02099586 0.02412891
 0.02215743 0.02377272 0.03352404 0.0315032 ]

mean value: 0.02468142509460449

key: test_mcc
value: [0.13762047 0.62641448 0.52727273 0.52295779 0.82275335 0.52727273
 0.39196475 0.82572282 0.4719399 0.44038551]

mean value: 0.5294304529705056

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.57142857 0.80952381 0.76190476 0.76190476 0.9047619 0.76190476
 0.66666667 0.9047619 0.71428571 0.71428571]

mean value: 0.7571428571428571

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.52631579 0.77777778 0.76190476 0.73684211 0.88888889 0.76190476
 0.58823529 0.9 0.66666667 0.7 ]

mean value: 0.7308536045997346

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.55555556 0.875 0.72727273 0.77777778 1. 0.8
 0.83333333 1. 0.85714286 0.77777778]

mean value: 0.8203860028860029

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.5 0.7 0.8 0.7 0.8 0.72727273
 0.45454545 0.81818182 0.54545455 0.63636364]

mean value: 0.6681818181818182

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.56818182 0.80454545 0.76363636 0.75909091 0.9 0.76363636
 0.67727273 0.90909091 0.72272727 0.71818182]

mean value: 0.7586363636363637

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.35714286 0.63636364 0.61538462 0.58333333 0.8 0.61538462
 0.41666667 0.81818182 0.5 0.53846154]

mean value: 0.5880919080919081

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.17

Accuracy on Blind test: 0.6

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.35438156 0.39279032 0.34204507 0.33374977 0.37280583 0.38470411
 0.33873773 0.35585856 0.37735271 0.37621069]

mean value: 0.3628636360168457

key: score_time
value: [0.01215506 0.01001287 0.00964713 0.00953436 0.01329851 0.00969267
 0.0101738 0.01015115 0.01490641 0.00933409]

mean value: 0.010890603065490723

key: test_mcc
value: [0.82275335 0.90829511 0.82572282 1. 1. 1.
 0.80909091 1. 1. 0.90909091]

mean value: 0.9274953099463279

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9047619 0.95238095 0.9047619 1. 1. 1.
 0.9047619 1. 1. 0.95238095]

mean value: 0.9619047619047619

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.88888889 0.94736842 0.90909091 1. 1. 1.
 0.90909091 1. 1. 0.95238095]

mean value: 0.9606820080504291

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 0.83333333 1. 1. 1.
 0.90909091 1. 1. 1. ]

mean value: 0.9742424242424242

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.8 0.9 1. 1. 1. 1.
 0.90909091 1. 1. 0.90909091]

mean value: 0.9518181818181818

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.9 0.95 0.90909091 1. 1. 1.
 0.90454545 1. 1. 0.95454545]

mean value: 0.9618181818181818

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.8 0.9 0.83333333 1. 1. 1.
 0.83333333 1. 1. 0.90909091]

mean value: 0.9275757575757576

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.72

Accuracy on Blind test: 0.87

Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
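
The repeated `Variables are collinear` warnings for this QDA pipeline are consistent with a per-class feature matrix that is rank deficient, which is unavoidable whenever a class has fewer rows than columns. The sketch below is not taken from the logged script. It uses hypothetical random data and hypothetical sizes (`n_samples`, `n_features`) to show how that condition could be checked with NumPy before fitting.

```python
import numpy as np

# Hypothetical sizes: one class with fewer rows than feature columns.
n_samples, n_features = 20, 30
rng = np.random.default_rng(42)
X_class = rng.normal(size=(n_samples, n_features))

# Centre the class data, then compare its rank with the number of features.
# A centred matrix with n rows has rank at most n - 1.
centered = X_class - X_class.mean(axis=0)
rank = np.linalg.matrix_rank(centered)
is_collinear = rank < n_features

print(f"rank={rank}, n_features={n_features}, collinear={is_collinear}")
```

If such a check comes back true for the real data, reducing the feature count or regularising the covariance estimate are the usual options to consider; which one suits this dataset is not something the log can settle.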
|
|
|
|
key: fit_time
|
|
value: [0.05870724 0.03189158 0.03561378 0.02327633 0.02330828 0.02847648
|
|
0.0224936 0.02254534 0.03917503 0.02338982]
|
|
|
|
mean value: 0.030887746810913087
|
|
|
|
key: score_time
|
|
value: [0.01973987 0.01692939 0.01304388 0.01248932 0.01487756 0.01249003
|
|
0.01517224 0.01476288 0.0195353 0.01688313]
|
|
|
|
mean value: 0.015592360496520996
|
|
|
|
key: test_mcc
|
|
value: [0.38924947 0.46249729 0.60302269 0.53300179 0.82572282 0.74161985
|
|
0.74161985 0.90829511 0.82275335 0.74161985]
|
|
|
|
mean value: 0.6769402069598355
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.66666667 0.76190476 0.71428571 0.9047619 0.85714286
|
|
0.85714286 0.95238095 0.9047619 0.85714286]
|
|
|
|
mean value: 0.8095238095238095
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.74074074 0.8 0.76923077 0.90909091 0.88
|
|
0.88 0.95652174 0.91666667 0.88 ]
|
|
|
|
mean value: 0.8446536539145235
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.58823529 0.66666667 0.625 0.83333333 0.78571429
|
|
0.78571429 0.91666667 0.84615385 0.78571429]
|
|
|
|
mean value: 0.7388754219636573
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.63636364 0.68181818 0.77272727 0.72727273 0.90909091 0.85
|
|
0.85 0.95 0.9 0.85 ]
|
|
|
|
mean value: 0.8127272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.58823529 0.66666667 0.625 0.83333333 0.78571429
|
|
0.78571429 0.91666667 0.84615385 0.78571429]
|
|
|
|
mean value: 0.7388754219636573
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.6
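
The per-fold QDA metrics above can be cross-checked by hand. The sketch below assumes a confusion matrix for the first QDA fold (TP=10, FN=0, FP=8, TN=3) that is inferred from the logged recall, precision, and accuracy rather than printed by the script. It also assumes that `jcc` is the Jaccard index TP / (TP + FP + FN) and that `roc_auc` was computed on hard class predictions, so it equals balanced accuracy.

```python
import math

# Counts inferred from the first QDA fold (an assumption, not logged output).
tp, fn, fp, tn = 10, 0, 8, 3

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
fscore = 2 * precision * recall / (precision + recall)
jaccard = tp / (tp + fp + fn)
# With hard predictions, ROC AUC reduces to the mean of TPR and TNR.
balanced_accuracy = (recall + tn / (tn + fp)) / 2
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("recall", recall),
    ("fscore", fscore),
    ("jaccard", jaccard),
    ("balanced_accuracy", balanced_accuracy),
    ("mcc", mcc),
]:
    print(f"{name}: {value:.8f}")
```

With recall equal to 1.0 there are no false negatives, so precision and the Jaccard index coincide, which is why `test_precision` and `test_jcc` print identical values for QDA.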
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05089164 0.02581358 0.02874446 0.03665304 0.03879476 0.03609467
|
|
0.0341928 0.01681209 0.01631188 0.01475215]
|
|
|
|
mean value: 0.029906105995178223
|
|
|
|
key: score_time
|
|
value: [0.03134346 0.01229501 0.02358603 0.0211916 0.02092743 0.0194459
|
|
0.01217341 0.0122261 0.01211548 0.01222587]
|
|
|
|
mean value: 0.017753028869628908
|
|
|
|
key: test_mcc
|
|
value: [0.71562645 0.71562645 0.33636364 1. 0.80909091 0.63305416
|
|
0.55161872 0.90829511 0.82572282 0.71818182]
|
|
|
|
mean value: 0.7213580077427297
|
|
|
|
key: train_mcc
|
|
value: [0.96830907 0.93736014 0.93670891 0.9264031 0.92597156 0.94714446
|
|
0.95767077 0.95789003 0.94713854 0.93672304]
|
|
|
|
mean value: 0.9441319626080061
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.85714286 0.66666667 1. 0.9047619 0.80952381
|
|
0.76190476 0.95238095 0.9047619 0.85714286]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.96825397 0.96825397 0.96296296 0.96296296 0.97354497
|
|
0.97883598 0.97883598 0.97354497 0.96825397]
|
|
|
|
mean value: 0.9719576719576719
|
|
|
|
key: test_fscore
|
|
value: [0.84210526 0.84210526 0.66666667 1. 0.9 0.8
|
|
0.73684211 0.95652174 0.9 0.85714286]
|
|
|
|
mean value: 0.8501383894518906
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 0.96774194 0.96875 0.96256684 0.96335079 0.97354497
|
|
0.9787234 0.97894737 0.97326203 0.96842105]
|
|
|
|
mean value: 0.9719435380809441
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.88888889 0.63636364 1. 0.9 0.88888889
|
|
0.875 0.91666667 1. 0.9 ]
|
|
|
|
mean value: 0.8894696969696969
|
|
|
|
key: train_precision
|
|
value: [0.9893617 0.98901099 0.95876289 0.97826087 0.95833333 0.96842105
|
|
0.9787234 0.96875 0.97849462 0.95833333]
|
|
|
|
mean value: 0.9726452194511284
|
|
|
|
key: test_recall
|
|
value: [0.8 0.8 0.7 1. 0.9 0.72727273
|
|
0.63636364 1. 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8200000000000001
|
|
|
|
key: train_recall
|
|
value: [0.97894737 0.94736842 0.97894737 0.94736842 0.96842105 0.9787234
|
|
0.9787234 0.9893617 0.96808511 0.9787234 ]
|
|
|
|
mean value: 0.9714669652855543

/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:148: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:151: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_roc_auc
|
|
value: [0.85454545 0.85454545 0.66818182 1. 0.90454545 0.81363636
|
|
0.76818182 0.95 0.90909091 0.85909091]
|
|
|
|
mean value: 0.8581818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [0.98415454 0.96836506 0.96819709 0.96304591 0.96293393 0.97357223
|
|
0.97883539 0.97889138 0.97351624 0.96830907]
|
|
|
|
mean value: 0.9719820828667414
|
|
|
|
key: test_jcc
|
|
value: [0.72727273 0.72727273 0.5 1. 0.81818182 0.66666667
|
|
0.58333333 0.91666667 0.81818182 0.75 ]
|
|
|
|
mean value: 0.7507575757575757
|
|
|
|
key: train_jcc
|
|
value: [0.96875 0.9375 0.93939394 0.92783505 0.92929293 0.94845361
|
|
0.95833333 0.95876289 0.94791667 0.93877551]
|
|
|
|
mean value: 0.9455013925282703
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.32369375 0.40332413 0.32787657 0.30810881 0.24554396 0.37028098
|
|
0.35776758 0.43476796 0.61309886 0.40438581]
|
|
|
|
mean value: 0.37888484001159667
|
|
|
|
key: score_time
|
|
value: [0.04028463 0.04569244 0.02587414 0.01976562 0.02221918 0.01939678
|
|
0.02430534 0.04026508 0.05725694 0.02162313]
|
|
|
|
mean value: 0.0316683292388916
|
|
|
|
key: test_mcc
|
|
value: [0.62641448 0.61818182 0.53935989 0.90909091 0.90829511 0.71818182
|
|
0.74795759 1. 0.80909091 0.71818182]
|
|
|
|
mean value: 0.7594754344239546
|
|
|
|
key: train_mcc
|
|
value: [0.98947251 0.95789003 0.95767077 0.94714446 0.95767077 0.94714446
|
|
0.95767077 0.94713854 0.92597984 0.93672304]
|
|
|
|
mean value: 0.9524505200298267
|
|
|
|
key: test_accuracy
|
|
value: [0.80952381 0.80952381 0.76190476 0.95238095 0.95238095 0.85714286
|
|
0.85714286 1. 0.9047619 0.85714286]
|
|
|
|
mean value: 0.8761904761904762
|
|
|
|
key: train_accuracy
|
|
value: [0.99470899 0.97883598 0.97883598 0.97354497 0.97883598 0.97354497
|
|
0.97883598 0.97354497 0.96296296 0.96825397]
|
|
|
|
mean value: 0.9761904761904762
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.8 0.70588235 0.95238095 0.94736842 0.85714286
|
|
0.84210526 1. 0.90909091 0.85714286]
|
|
|
|
mean value: 0.8648891390687057
|
|
|
|
key: train_fscore
|
|
value: [0.9947644 0.9787234 0.97894737 0.97354497 0.97894737 0.97354497
|
|
0.9787234 0.97326203 0.96296296 0.96842105]
|
|
|
|
mean value: 0.9761841938028554
|
|
|
|
key: test_precision
|
|
value: [0.875 0.8 0.85714286 0.90909091 1. 0.9
|
|
1. 1. 0.90909091 0.9 ]
|
|
|
|
mean value: 0.9150324675324675
|
|
|
|
key: train_precision
|
|
value: [0.98958333 0.98924731 0.97894737 0.9787234 0.97894737 0.96842105
|
|
0.9787234 0.97849462 0.95789474 0.95833333]
|
|
|
|
mean value: 0.9757315936976966
|
|
|
|
key: test_recall
|
|
value: [0.7 0.8 0.6 1. 0.9 0.81818182
|
|
0.72727273 1. 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8272727272727273
|
|
|
|
key: train_recall
|
|
value: [1. 0.96842105 0.97894737 0.96842105 0.97894737 0.9787234
|
|
0.9787234 0.96808511 0.96808511 0.9787234 ]
|
|
|
|
mean value: 0.9767077267637179
|
|
|
|
key: test_roc_auc
|
|
value: [0.80454545 0.80909091 0.75454545 0.95454545 0.95 0.85909091
|
|
0.86363636 1. 0.90454545 0.85909091]
|
|
|
|
mean value: 0.8759090909090909
|
|
|
|
key: train_roc_auc
|
|
value: [0.99468085 0.97889138 0.97883539 0.97357223 0.97883539 0.97357223
|
|
0.97883539 0.97351624 0.96298992 0.96830907]
|
|
|
|
mean value: 0.9762038073908175
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.66666667 0.54545455 0.90909091 0.9 0.75
|
|
0.72727273 1. 0.83333333 0.75 ]
|
|
|
|
mean value: 0.7718181818181818
|
|
|
|
key: train_jcc
|
|
value: [0.98958333 0.95833333 0.95876289 0.94845361 0.95876289 0.94845361
|
|
0.95833333 0.94791667 0.92857143 0.93877551]
|
|
|
|
mean value: 0.9535946595132898
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07115984 0.11514401 0.103935 0.06160641 0.0467484 0.14352918
|
|
0.13648367 0.05514574 0.06950617 0.10242081]
|
|
|
|
mean value: 0.09056792259216309
|
|
|
|
key: score_time
|
|
value: [0.02112889 0.01284909 0.02215528 0.02116251 0.02008915 0.0174675
|
|
0.0115118 0.03403592 0.04912376 0.01486492]
|
|
|
|
mean value: 0.022438883781433105
|
|
|
|
key: test_mcc
|
|
value: [ 0.41475753 0.54761905 0.73192505 0.41475753 0.07142857 0.73192505
|
|
0.28288947 0.38575837 0.41475753 -0.23809524]
|
|
|
|
mean value: 0.3757722933220292
|
|
|
|
key: train_mcc
|
|
value: [0.8120433 0.82904734 0.82904734 0.88144164 0.8120433 0.77888301
|
|
0.82904734 0.82958203 0.81310356 0.84732411]
|
|
|
|
mean value: 0.8261562964430988
|
|
|
|
key: test_accuracy
|
|
value: [0.69230769 0.76923077 0.84615385 0.69230769 0.53846154 0.84615385
|
|
0.61538462 0.69230769 0.69230769 0.38461538]
|
|
|
|
mean value: 0.676923076923077
|
|
|
|
key: train_accuracy
|
|
value: [0.90598291 0.91452991 0.91452991 0.94017094 0.90598291 0.88888889
|
|
0.91452991 0.91452991 0.90598291 0.92307692]
|
|
|
|
mean value: 0.9128205128205128
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.76923077 0.85714286 0.71428571 0.5 0.83333333
|
|
0.54545455 0.75 0.66666667 0.42857143]
|
|
|
|
mean value: 0.6778971028971029
|
|
|
|
key: train_fscore
|
|
value: [0.90756303 0.91525424 0.91525424 0.94214876 0.90756303 0.8907563
|
|
0.9137931 0.91525424 0.90756303 0.92436975]
|
|
|
|
mean value: 0.9139519701693681
|
|
|
|
key: test_precision
|
|
value: [0.625 0.71428571 0.75 0.625 0.5 1.
|
|
0.75 0.66666667 0.8 0.42857143]
|
|
|
|
mean value: 0.685952380952381
|
|
|
|
key: train_precision
|
|
value: [0.9 0.91525424 0.91525424 0.91935484 0.9 0.86885246
|
|
0.9137931 0.9 0.8852459 0.90163934]
|
|
|
|
mean value: 0.9019394121652258
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.83333333 1. 0.83333333 0.5 0.71428571
|
|
0.42857143 0.85714286 0.57142857 0.42857143]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [0.91525424 0.91525424 0.91525424 0.96610169 0.91525424 0.9137931
|
|
0.9137931 0.93103448 0.93103448 0.94827586]
|
|
|
|
mean value: 0.9265049678550555
|
|
|
|
key: test_roc_auc
|
|
value: [0.70238095 0.77380952 0.85714286 0.70238095 0.53571429 0.85714286
|
|
0.63095238 0.67857143 0.70238095 0.38095238]
|
|
|
|
mean value: 0.6821428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [0.90590298 0.91452367 0.91452367 0.9399474 0.90590298 0.88909994
|
|
0.91452367 0.91466978 0.90619521 0.92329047]
|
|
|
|
mean value: 0.9128579777907656
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.625 0.75 0.55555556 0.33333333 0.71428571
|
|
0.375 0.6 0.5 0.27272727]
|
|
|
|
mean value: 0.5281457431457431
|
|
|
|
key: train_jcc
|
|
value: [0.83076923 0.84375 0.84375 0.890625 0.83076923 0.8030303
|
|
0.84126984 0.84375 0.83076923 0.859375 ]
|
|
|
|
mean value: 0.8417857836607837
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
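
The ConvergenceWarning logged for this Logistic Regression run suggests raising `max_iter` or scaling the data. The pipeline shown in the log already applies `MinMaxScaler` to the numerical columns, so the iteration limit is the more likely lever. The sketch below is an assumption about how that could look, not the logged script's code: the synthetic data from `make_classification`, the step names, and the value `max_iter=1000` are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data; the real features are not available here.
X, y = make_classification(n_samples=200, n_features=20, random_state=42)

# Scale first, then fit lbfgs with a higher iteration cap than the default 100.
pipe = Pipeline(steps=[
    ("scale", MinMaxScaler()),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])
pipe.fit(X, y)

# n_iter_ reports how many iterations the solver actually used.
n_iter = int(pipe.named_steps["model"].n_iter_[0])
print(f"lbfgs iterations used: {n_iter} of 1000")
```

Comparing `n_iter_` with `max_iter` after fitting is a quick way to confirm whether the solver stopped because it converged or because it hit the cap.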
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
value: [2.14313459 2.65574384 2.69484901 2.17424965 2.27060723 2.20656586
2.04485202 2.53665352 2.39897084 2.08672571]

mean value: 2.3212352275848387

key: score_time
value: [0.01857519 0.02521801 0.01422095 0.02029347 0.03888059 0.01733184
0.02459574 0.04596186 0.0188508 0.01879811]

mean value: 0.024272656440734862

key: test_mcc
value: [0.41475753 0.54761905 0.54761905 0.41475753 0.59160798 0.85714286
0.23809524 0.53674504 0.69047619 0.09759001]

mean value: 0.49364104686851423

key: train_mcc
value: [0.93218361 1. 1. 0.88144164 1. 0.8974284
1. 1. 1. 1. ]

mean value: 0.9711053657014244

key: test_accuracy
value: [0.69230769 0.76923077 0.76923077 0.69230769 0.76923077 0.92307692
0.61538462 0.76923077 0.84615385 0.53846154]

mean value: 0.7384615384615385

key: train_accuracy
value: [0.96581197 1. 1. 0.94017094 1. 0.94871795
1. 1. 1. 1. ]

mean value: 0.9854700854700855

key: test_fscore
value: [0.71428571 0.76923077 0.76923077 0.71428571 0.66666667 0.92307692
0.61538462 0.8 0.85714286 0.5 ]

mean value: 0.7329304029304029

key: train_fscore
value: [0.96551724 1. 1. 0.94214876 1. 0.94827586
1. 1. 1. 1. ]

mean value: 0.9855941863778854

key: test_precision
value: [0.625 0.71428571 0.71428571 0.625 1. 1.
0.66666667 0.75 0.85714286 0.6 ]

mean value: 0.7552380952380953

key: train_precision
value: [0.98245614 1. 1. 0.91935484 1. 0.94827586
1. 1. 1. 1. ]

mean value: 0.985008684112952

key: test_recall
value: [0.83333333 0.83333333 0.83333333 0.83333333 0.5 0.85714286
0.57142857 0.85714286 0.85714286 0.42857143]

mean value: 0.7404761904761905

key: train_recall
value: [0.94915254 1. 1. 0.96610169 1. 0.94827586
1. 1. 1. 1. ]

mean value: 0.9863530099357101

key: test_roc_auc
value: [0.70238095 0.77380952 0.77380952 0.70238095 0.75 0.92857143
0.61904762 0.76190476 0.8452381 0.54761905]

mean value: 0.7404761904761905

key: train_roc_auc
value: [0.96595558 1. 1. 0.9399474 1. 0.9487142
1. 1. 1. 1. ]

mean value: 0.9854617182933957

key: test_jcc
value: [0.55555556 0.625 0.625 0.55555556 0.5 0.85714286
0.44444444 0.66666667 0.75 0.33333333]

mean value: 0.5912698412698413

key: train_jcc
value: [0.93333333 1. 1. 0.890625 1. 0.90163934
1. 1. 1. 1. ]

mean value: 0.9725597677595629

MCC on Blind test: 0.43

Accuracy on Blind test: 0.73
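
Note on reading the per-fold metrics: the sketch below shows how the seven reported scores relate to a single confusion matrix. The counts (tp=5, fp=3, fn=1, tn=4) are an assumption: they are inferred from the first Logistic RegressionCV fold's reported precision, recall and accuracy, and are not printed anywhere in this log. It also assumes roc_auc was scored on hard 0/1 predictions, in which case it equals the mean of recall and specificity.

```python
import math

# Hypothetical confusion-matrix counts for the first CV fold (13 test samples).
# Inferred from the fold's reported scores; the log does not print them.
tp, fp, fn, tn = 5, 3, 1, 4

precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
fscore = 2 * tp / (2 * tp + fp + fn)
jcc = tp / (tp + fp + fn)  # Jaccard index of the positive class
# For hard label predictions, ROC AUC reduces to balanced accuracy.
roc_auc = (recall + specificity) / 2
# Matthews correlation coefficient from the four counts.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

scores = {
    "test_mcc": mcc,
    "test_accuracy": accuracy,
    "test_fscore": fscore,
    "test_precision": precision,
    "test_recall": recall,
    "test_roc_auc": roc_auc,
    "test_jcc": jcc,
}
for key, value in scores.items():
    print(f"{key}: {value:.8f}")
```

Under these assumed counts, each printed value can be compared with the first entry of the corresponding `value:` array in the Logistic RegressionCV block.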

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01936674 0.01057959 0.01035166 0.01044059 0.0103271 0.01035023
0.01044679 0.01057243 0.01038861 0.01044059]

mean value: 0.011326432228088379

key: score_time
value: [0.01023579 0.01007533 0.01012969 0.01011801 0.01018596 0.01004577
0.0100286 0.01010728 0.01014471 0.01001549]

mean value: 0.010108661651611329

key: test_mcc
value: [ 0.09759001 0.23809524 0.38095238 0.09759001 0.38095238 0.53674504
0.23809524 0.38575837 0.22537447 -0.41475753]

mean value: 0.21663956046363453

key: train_mcc
value: [0.62939175 0.56318771 0.50572841 0.54006981 0.56027975 0.52451345
0.63808526 0.55801254 0.55654161 0.57355974]

mean value: 0.5649370021355652

key: test_accuracy
value: [0.53846154 0.61538462 0.69230769 0.53846154 0.69230769 0.76923077
0.61538462 0.69230769 0.61538462 0.30769231]

mean value: 0.6076923076923078

key: train_accuracy
value: [0.81196581 0.77777778 0.75213675 0.76923077 0.77777778 0.76068376
0.81196581 0.76923077 0.74358974 0.78632479]

mean value: 0.7760683760683761

key: test_fscore
value: [0.57142857 0.61538462 0.66666667 0.57142857 0.66666667 0.8
0.61538462 0.75 0.70588235 0.4 ]

mean value: 0.6362842059900883

key: train_fscore
value: [0.82539683 0.796875 0.76422764 0.7804878 0.79365079 0.7704918
0.828125 0.79389313 0.79166667 0.78991597]

mean value: 0.7934730632304993

key: test_precision
value: [0.5 0.57142857 0.66666667 0.5 0.66666667 0.75
0.66666667 0.66666667 0.6 0.375 ]

mean value: 0.5963095238095237

key: train_precision
value: [0.7761194 0.73913043 0.734375 0.75 0.74626866 0.734375
0.75714286 0.71232877 0.6627907 0.7704918 ]

mean value: 0.7383022619703353

key: test_recall
value: [0.66666667 0.66666667 0.66666667 0.66666667 0.66666667 0.85714286
0.57142857 0.85714286 0.85714286 0.42857143]

mean value: 0.6904761904761905

key: train_recall
value: [0.88135593 0.86440678 0.79661017 0.81355932 0.84745763 0.81034483
0.9137931 0.89655172 0.98275862 0.81034483]

mean value: 0.8617182933956751

key: test_roc_auc
value: [0.54761905 0.61904762 0.69047619 0.54761905 0.69047619 0.76190476
0.61904762 0.67857143 0.5952381 0.29761905]

mean value: 0.6047619047619048

key: train_roc_auc
value: [0.81136762 0.77703098 0.75175336 0.76884863 0.77717709 0.76110462
0.81282876 0.77030976 0.7456166 0.78652835]

mean value: 0.7762565751022794

key: test_jcc
value: [0.4 0.44444444 0.5 0.4 0.5 0.66666667
0.44444444 0.6 0.54545455 0.25 ]

mean value: 0.4751010101010101

key: train_jcc
value: [0.7027027 0.66233766 0.61842105 0.64 0.65789474 0.62666667
0.70666667 0.65822785 0.65517241 0.65277778]

mean value: 0.6580867527519529

MCC on Blind test: 0.29

Accuracy on Blind test: 0.67
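
Note on the "mean value" lines: each one appears to be the arithmetic mean of the ten per-fold scores printed above it. The sketch below recomputes it for the Gaussian NB MCC scores and reports the train/test gap as a rough overfitting indicator. The fold values are copied from this log (rounded to 8 decimals, so the result agrees only to about that precision); the variable names are illustrative and not taken from the original script.

```python
from statistics import fmean

# Per-fold MCC values copied from the Gaussian NB block of this log.
test_mcc = [0.09759001, 0.23809524, 0.38095238, 0.09759001, 0.38095238,
            0.53674504, 0.23809524, 0.38575837, 0.22537447, -0.41475753]
train_mcc = [0.62939175, 0.56318771, 0.50572841, 0.54006981, 0.56027975,
             0.52451345, 0.63808526, 0.55801254, 0.55654161, 0.57355974]

mean_test = fmean(test_mcc)
mean_train = fmean(train_mcc)
# A large positive gap suggests the model fits the training folds
# better than it generalises to the held-out folds.
gap = mean_train - mean_test

print(f"mean test_mcc:  {mean_test:.6f}")
print(f"mean train_mcc: {mean_train:.6f}")
print(f"train - test:   {gap:.6f}")
print(f"test_mcc range: {min(test_mcc)} to {max(test_mcc)}")
```

The spread between the lowest and highest test fold is wide here, which is expected with only 13 samples per test fold.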

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01078939 0.01074529 0.01073861 0.01068044 0.01072288 0.01063657
0.01069188 0.01077175 0.01050353 0.01064897]

mean value: 0.010692930221557618

key: score_time
value: [0.01027346 0.0101223 0.01015067 0.01009154 0.01016331 0.01005459
0.01008916 0.01028109 0.01001859 0.01019144]

mean value: 0.010143613815307618

key: test_mcc
value: [-0.05143445 0.54761905 0.53674504 -0.07142857 -0.28288947 0.54761905
0.28288947 0.38095238 0.38095238 -0.38575837]

mean value: 0.18852665009433853

key: train_mcc
value: [0.5393392 0.54074089 0.55597781 0.58971362 0.59133581 0.61080452
0.6087526 0.59794138 0.56027975 0.64168717]

mean value: 0.5836572739419282

key: test_accuracy
value: [0.46153846 0.76923077 0.76923077 0.46153846 0.38461538 0.76923077
0.61538462 0.69230769 0.69230769 0.30769231]

mean value: 0.5923076923076923

key: train_accuracy
value: [0.76923077 0.76923077 0.77777778 0.79487179 0.79487179 0.8034188
0.8034188 0.79487179 0.77777778 0.82051282]

mean value: 0.7905982905982906

key: test_fscore
value: [0.53333333 0.76923077 0.72727273 0.46153846 0.2 0.76923077
0.54545455 0.71428571 0.71428571 0.18181818]

mean value: 0.5616450216450216

key: train_fscore
value: [0.76521739 0.76106195 0.77586207 0.79661017 0.78947368 0.78899083
0.79279279 0.77358491 0.75925926 0.81415929]

mean value: 0.7817012336310473

key: test_precision
value: [0.44444444 0.71428571 0.8 0.42857143 0.25 0.83333333
0.75 0.71428571 0.71428571 0.25 ]

mean value: 0.589920634920635

key: train_precision
value: [0.78571429 0.7962963 0.78947368 0.79661017 0.81818182 0.84313725
0.83018868 0.85416667 0.82 0.83636364]

mean value: 0.8170132491071999

key: test_recall
value: [0.66666667 0.83333333 0.66666667 0.5 0.16666667 0.71428571
0.42857143 0.71428571 0.71428571 0.14285714]

mean value: 0.5547619047619048

key: train_recall
value: [0.74576271 0.72881356 0.76271186 0.79661017 0.76271186 0.74137931
0.75862069 0.70689655 0.70689655 0.79310345]

mean value: 0.7503506721215664

key: test_roc_auc
value: [0.47619048 0.77380952 0.76190476 0.46428571 0.36904762 0.77380952
0.63095238 0.69047619 0.69047619 0.32142857]

mean value: 0.5952380952380952

key: train_roc_auc
value: [0.76943308 0.76957919 0.77790766 0.79485681 0.79514904 0.80289305
0.80303916 0.79412624 0.77717709 0.82028054]

mean value: 0.7904441846873174

key: test_jcc
value: [0.36363636 0.625 0.57142857 0.3 0.11111111 0.625
0.375 0.55555556 0.55555556 0.1 ]

mean value: 0.4182287157287157

key: train_jcc
value: [0.61971831 0.61428571 0.63380282 0.66197183 0.65217391 0.65151515
0.65671642 0.63076923 0.6119403 0.68656716]

mean value: 0.6419460847957068

MCC on Blind test: 0.17

Accuracy on Blind test: 0.6

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.01046824 0.01135015 0.01026702 0.01008916 0.01003551 0.0104847
0.01012325 0.01084805 0.010113 0.01030326]

mean value: 0.010408234596252442

key: score_time
value: [0.01920247 0.0241816 0.01840687 0.02027678 0.01942277 0.01908255
0.02879333 0.0157578 0.01938295 0.0225172 ]

mean value: 0.020702433586120606

key: test_mcc
value: [ 0.38095238 0.38095238 0.05143445 0.54761905 -0.23809524 -0.05143445
-0.54761905 0.21957752 0.50709255 -0.7200823 ]

mean value: 0.053039729323695814

key: train_mcc
value: [0.38893486 0.31846508 0.49235618 0.42340863 0.38583198 0.37313533
0.47043398 0.38607028 0.39185302 0.39185302]

mean value: 0.4022342373708152

key: test_accuracy
value: [0.69230769 0.69230769 0.53846154 0.76923077 0.38461538 0.46153846
0.23076923 0.61538462 0.69230769 0.15384615]

mean value: 0.5230769230769231

key: train_accuracy
value: [0.69230769 0.65811966 0.74358974 0.70940171 0.69230769 0.68376068
0.73504274 0.69230769 0.69230769 0.69230769]

mean value: 0.6991452991452991

key: test_fscore
value: [0.66666667 0.66666667 0.4 0.76923077 0.33333333 0.36363636
0.28571429 0.66666667 0.6 0. ]

mean value: 0.4751914751914752

key: train_fscore
value: [0.67272727 0.64285714 0.72727273 0.69090909 0.68421053 0.64761905
0.72566372 0.67272727 0.65384615 0.65384615]

mean value: 0.677167910493481

key: test_precision
value: [0.66666667 0.66666667 0.5 0.71428571 0.33333333 0.5
0.28571429 0.625 1. 0. ]

mean value: 0.5291666666666667

key: train_precision
value: [0.7254902 0.67924528 0.78431373 0.74509804 0.70909091 0.72340426
0.74545455 0.71153846 0.73913043 0.73913043]

mean value: 0.7301896284771464

key: test_recall
value: [0.66666667 0.66666667 0.33333333 0.83333333 0.33333333 0.28571429
0.28571429 0.71428571 0.42857143 0. ]

mean value: 0.45476190476190476

key: train_recall
value: [0.62711864 0.61016949 0.6779661 0.6440678 0.66101695 0.5862069
0.70689655 0.63793103 0.5862069 0.5862069 ]

mean value: 0.6323787258912916

key: test_roc_auc
value: [0.69047619 0.69047619 0.52380952 0.77380952 0.38095238 0.47619048
0.22619048 0.60714286 0.71428571 0.16666667]

mean value: 0.525

key: train_roc_auc
value: [0.69286967 0.65853302 0.74415546 0.70996493 0.69257744 0.68293396
0.73480421 0.69184687 0.69140853 0.69140853]

mean value: 0.6990502630040912

key: test_jcc
value: [0.5 0.5 0.25 0.625 0.2 0.22222222
0.16666667 0.5 0.42857143 0. ]

mean value: 0.33924603174603174

key: train_jcc
value: [0.50684932 0.47368421 0.57142857 0.52777778 0.52 0.47887324
0.56944444 0.50684932 0.48571429 0.48571429]

mean value: 0.5126335445179286

MCC on Blind test: 0.33

Accuracy on Blind test: 0.67

Model_name: SVM
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.01252365 0.01188517 0.01209474 0.01199675 0.01207352 0.01212645
0.01216841 0.01206255 0.01212716 0.01190042]

mean value: 0.012095880508422852

key: score_time
value: [0.01078773 0.01077604 0.01067948 0.01055169 0.01071 0.01069093
0.01049042 0.01056886 0.01054859 0.01059175]

mean value: 0.010639548301696777

key: test_mcc
value: [ 0.23809524 0.53674504 0.54761905 0.41475753 -0.23809524 0.54761905
0.14085904 0.53674504 0.41475753 -0.38095238]

mean value: 0.2758149898990107

key: train_mcc
value: [0.66472504 0.64361355 0.64361355 0.72698045 0.71044192 0.71177678
0.79485681 0.67593781 0.693731 0.67743539]

mean value: 0.6943112296352275

key: test_accuracy
value: [0.61538462 0.76923077 0.76923077 0.69230769 0.38461538 0.76923077
0.53846154 0.76923077 0.69230769 0.30769231]

mean value: 0.6307692307692307

key: train_accuracy
value: [0.82905983 0.82051282 0.82051282 0.86324786 0.85470085 0.85470085
0.8974359 0.83760684 0.84615385 0.83760684]

mean value: 0.8461538461538461

key: test_fscore
value: [0.61538462 0.72727273 0.76923077 0.71428571 0.33333333 0.76923077
0.4 0.8 0.66666667 0.30769231]

mean value: 0.6103096903096903

key: train_fscore
value: [0.81818182 0.81415929 0.81415929 0.86206897 0.85217391 0.84684685
0.89655172 0.83185841 0.83928571 0.82882883]

mean value: 0.8404114801992302

key: test_precision
value: [0.57142857 0.8 0.71428571 0.625 0.33333333 0.83333333
0.66666667 0.75 0.8 0.33333333]

mean value: 0.6427380952380952

key: train_precision
value: [0.88235294 0.85185185 0.85185185 0.87719298 0.875 0.88679245
0.89655172 0.85454545 0.87037037 0.86792453]

mean value: 0.8714434157522146

key: test_recall
value: [0.66666667 0.66666667 0.83333333 0.83333333 0.33333333 0.71428571
0.28571429 0.85714286 0.57142857 0.28571429]

mean value: 0.6047619047619047

key: train_recall
value: [0.76271186 0.77966102 0.77966102 0.84745763 0.83050847 0.81034483
0.89655172 0.81034483 0.81034483 0.79310345]

mean value: 0.8120689655172414

key: test_roc_auc
value: [0.61904762 0.76190476 0.77380952 0.70238095 0.38095238 0.77380952
0.55952381 0.76190476 0.70238095 0.30952381]

mean value: 0.6345238095238096

key: train_roc_auc
value: [0.82963179 0.82086499 0.82086499 0.86338399 0.85490941 0.85432496
0.8974284 0.8373758 0.84585038 0.83722969]

mean value: 0.8461864406779661

key: test_jcc
value: [0.44444444 0.57142857 0.625 0.55555556 0.2 0.625
0.25 0.66666667 0.5 0.18181818]

mean value: 0.461991341991342

key: train_jcc
value: [0.69230769 0.68656716 0.68656716 0.75757576 0.74242424 0.734375
0.8125 0.71212121 0.72307692 0.70769231]

mean value: 0.7255207463556345

MCC on Blind test: 0.29

Accuracy on Blind test: 0.67

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(

Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [0.92927122 1.5305016 1.36941218 1.77340817 1.98329568 1.87210155
2.13366818 1.67377758 1.89807558 2.13494849]

mean value: 1.7298460245132445

key: score_time
value: [0.02280736 0.01201725 0.01104426 0.0204463 0.02194667 0.02331734
0.02230024 0.03766036 0.01530027 0.01807332]

mean value: 0.02049133777618408

key: test_mcc
value: [ 0.28288947 0.54761905 0.38095238 0.09759001 -0.07142857 0.73192505
-0.21957752 0.21957752 0.85714286 -0.41475753]

mean value: 0.24119327202193427

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.61538462 0.76923077 0.69230769 0.53846154 0.46153846 0.84615385
0.38461538 0.61538462 0.92307692 0.30769231]

mean value: 0.6153846153846154

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.66666667 0.76923077 0.66666667 0.57142857 0.46153846 0.83333333
0.33333333 0.66666667 0.92307692 0.4 ]

mean value: 0.6291941391941391

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.55555556 0.71428571 0.66666667 0.5 0.42857143 1.
0.4 0.625 1. 0.375 ]

mean value: 0.6265079365079365

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.83333333 0.83333333 0.66666667 0.66666667 0.5 0.71428571
0.28571429 0.71428571 0.85714286 0.42857143]

mean value: 0.65

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.63095238 0.77380952 0.69047619 0.54761905 0.46428571 0.85714286
0.39285714 0.60714286 0.92857143 0.29761905]

mean value: 0.6190476190476191

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.5 0.625 0.5 0.4 0.3 0.71428571
0.2 0.5 0.85714286 0.25 ]

mean value: 0.48464285714285715

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02312112 0.02405119 0.02581382 0.03118682 0.04043317 0.035918
|
|
0.03535628 0.03518629 0.01530671 0.02257276]
|
|
|
|
mean value: 0.028894615173339844
|
|
|
|
key: score_time
|
|
value: [0.02141547 0.03261065 0.03187895 0.02285576 0.02194715 0.02182031
|
|
0.02029324 0.02223802 0.01271105 0.01209235]
|
|
|
|
mean value: 0.02198629379272461
|
|
|
|
key: test_mcc
|
|
value: [0.23809524 0.6172134 0.85391256 0.85714286 0.69047619 1.
|
|
1. 0.85714286 0.85714286 0.73192505]
|
|
|
|
mean value: 0.7703051018389734
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.76923077 0.92307692 0.92307692 0.84615385 1.
|
|
1. 0.92307692 0.92307692 0.84615385]
|
|
|
|
mean value: 0.8769230769230769
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.61538462 0.8 0.90909091 0.92307692 0.83333333 1.
|
|
1. 0.92307692 0.92307692 0.83333333]
|
|
|
|
mean value: 0.876037296037296
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.57142857 0.66666667 1. 0.85714286 0.83333333 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.8928571428571428
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.66666667 1. 0.83333333 1. 0.83333333 1.
|
|
1. 0.85714286 0.85714286 0.71428571]
|
|
|
|
mean value: 0.8761904761904762
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61904762 0.78571429 0.91666667 0.92857143 0.8452381 1.
|
|
1. 0.92857143 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8809523809523809
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.44444444 0.66666667 0.83333333 0.85714286 0.71428571 1.
|
|
1. 0.85714286 0.85714286 0.71428571]
|
|
|
|
mean value: 0.7944444444444444
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08921742 0.08900595 0.09549665 0.12468243 0.12501621 0.12542987
|
|
0.11990476 0.1254077 0.12537408 0.09747744]
|
|
|
|
mean value: 0.11170125007629395
|
|
|
|
key: score_time
|
|
value: [0.01770282 0.01791883 0.0226357 0.02327061 0.02341628 0.02357483
|
|
0.02346158 0.02348065 0.02347827 0.01771784]
|
|
|
|
mean value: 0.021665740013122558
|
|
|
|
key: test_mcc
|
|
value: [ 0.09759001 0.54761905 0.21957752 0.41475753 -0.09759001 1.
|
|
0.14085904 0.54761905 0.6172134 0.09759001]
|
|
|
|
mean value: 0.3585235592252616
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.53846154 0.76923077 0.61538462 0.69230769 0.46153846 1.
|
|
0.53846154 0.76923077 0.76923077 0.53846154]
|
|
|
|
mean value: 0.6692307692307693
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.76923077 0.54545455 0.71428571 0.36363636 1.
|
|
0.4 0.76923077 0.72727273 0.5 ]
|
|
|
|
mean value: 0.636053946053946
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.71428571 0.6 0.625 0.4 1.
|
|
0.66666667 0.83333333 1. 0.6 ]
|
|
|
|
mean value: 0.6939285714285715
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.83333333 0.5 0.83333333 0.33333333 1.
|
|
0.28571429 0.71428571 0.57142857 0.42857143]
|
|
|
|
mean value: 0.6166666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.54761905 0.77380952 0.60714286 0.70238095 0.45238095 1.
|
|
0.55952381 0.77380952 0.78571429 0.54761905]
|
|
|
|
mean value: 0.675
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.625 0.375 0.55555556 0.22222222 1.
|
|
0.25 0.625 0.57142857 0.33333333]
|
|
|
|
mean value: 0.4957539682539682
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00922561 0.01216745 0.00913382 0.00917363 0.00927591 0.00904441
|
|
0.00901008 0.00905704 0.00984097 0.0092206 ]
|
|
|
|
mean value: 0.009514951705932617
|
|
|
|
key: score_time
|
|
value: [0.00917149 0.01156068 0.00884557 0.00909114 0.00899363 0.00888038
|
|
0.00885415 0.00877523 0.00928211 0.00903535]
|
|
|
|
mean value: 0.009248971939086914
|
|
|
|
key: test_mcc
|
|
value: [ 0.28288947 0.38095238 0.05143445 0.28288947 -0.41475753 0.59160798
|
|
0.09759001 0.38095238 0.09759001 -0.53674504]
|
|
|
|
mean value: 0.12144035835279779
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.69230769 0.53846154 0.61538462 0.30769231 0.76923077
|
|
0.53846154 0.69230769 0.53846154 0.23076923]
|
|
|
|
mean value: 0.5538461538461539
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.4 0.66666667 0.18181818 0.82352941
|
|
0.5 0.71428571 0.5 0.16666667]
|
|
|
|
mean value: 0.5286299974535269
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.66666667 0.5 0.55555556 0.2 0.7
|
|
0.6 0.71428571 0.6 0.2 ]
|
|
|
|
mean value: 0.5292063492063492
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.66666667 0.33333333 0.83333333 0.16666667 1.
|
|
0.42857143 0.71428571 0.42857143 0.14285714]
|
|
|
|
mean value: 0.5547619047619048
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.63095238 0.69047619 0.52380952 0.63095238 0.29761905 0.75
|
|
0.54761905 0.69047619 0.54761905 0.23809524]
|
|
|
|
mean value: 0.5547619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.25 0.5 0.1 0.7
|
|
0.33333333 0.55555556 0.33333333 0.09090909]
|
|
|
|
mean value: 0.38631313131313133
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.29
|
|
|
|
Accuracy on Blind test: 0.4
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.12453771 1.23042583 1.6623981 1.80660892 1.10945749 1.1398468
|
|
1.11876416 1.10889626 1.09741282 2.23863149]
|
|
|
|
mean value: 1.3636979579925537
|
|
|
|
key: score_time
|
|
value: [0.0901432 0.105124 0.27335095 0.0891552 0.09121609 0.09351397
|
|
0.08874822 0.08856773 0.08960176 0.21340656]
|
|
|
|
mean value: 0.1222827672958374
|
|
|
|
key: test_mcc
|
|
value: [0.38095238 0.6172134 0.53674504 0.41475753 0.38575837 1.
|
|
0.39477102 0.73192505 0.28288947 0.28288947]
|
|
|
|
mean value: 0.5027901748379063
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.69230769 0.76923077 0.76923077 0.69230769 0.69230769 1.
|
|
0.61538462 0.84615385 0.61538462 0.61538462]
|
|
|
|
mean value: 0.7307692307692308
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.8 0.72727273 0.71428571 0.6 1.
|
|
0.44444444 0.83333333 0.54545455 0.54545455]
|
|
|
|
mean value: 0.6876911976911977
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.66666667 0.8 0.625 0.75 1.
|
|
1. 1. 0.75 0.75 ]
|
|
|
|
mean value: 0.8008333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.66666667 1. 0.66666667 0.83333333 0.5 1.
|
|
0.28571429 0.71428571 0.42857143 0.42857143]
|
|
|
|
mean value: 0.6523809523809524
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.69047619 0.78571429 0.76190476 0.70238095 0.67857143 1.
|
|
0.64285714 0.85714286 0.63095238 0.63095238]
|
|
|
|
mean value: 0.7380952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.66666667 0.57142857 0.55555556 0.42857143 1.
|
|
0.28571429 0.71428571 0.375 0.375 ]
|
|
|
|
mean value: 0.5472222222222222
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.01023936 1.5312047 1.59749556 1.66958928 1.76442599 1.60279465
|
|
1.98377275 1.63427329 1.65822673 1.49137998]
|
|
|
|
mean value: 1.6943402290344238
|
|
|
|
key: score_time
|
|
value: [0.21109867 0.20521808 0.27385473 0.18928909 0.19972277 0.15325022
|
|
0.16408372 0.16386247 0.17173505 0.19090343]
|
|
|
|
mean value: 0.1923018217086792
|
|
|
|
key: test_mcc
|
|
value: [0.23809524 0.6172134 0.69047619 0.6172134 0.38095238 1.
|
|
0.39477102 0.85391256 0.41475753 0.09759001]
|
|
|
|
mean value: 0.5304981728324353
|
|
|
|
key: train_mcc
|
|
value: [0.94994292 0.94994292 0.96636481 0.98304594 0.96636481 0.93384219
|
|
0.98305085 0.94998574 0.93161894 0.94998574]
|
|
|
|
mean value: 0.9564144838510643
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.76923077 0.84615385 0.76923077 0.69230769 1.
|
|
0.61538462 0.92307692 0.69230769 0.53846154]
|
|
|
|
mean value: 0.7461538461538462
|
|
|
|
key: train_accuracy
|
|
value: [0.97435897 0.97435897 0.98290598 0.99145299 0.98290598 0.96581197
|
|
0.99145299 0.97435897 0.96581197 0.97435897]
|
|
|
|
mean value: 0.9777777777777777
|
|
|
|
key: test_fscore
|
|
value: [0.61538462 0.8 0.83333333 0.8 0.66666667 1.
|
|
0.44444444 0.93333333 0.66666667 0.5 ]
|
|
|
|
mean value: 0.7259829059829059
|
|
|
|
key: train_fscore
|
|
value: [0.97520661 0.97520661 0.98333333 0.99159664 0.98333333 0.96666667
|
|
0.99145299 0.97478992 0.96551724 0.97478992]
|
|
|
|
mean value: 0.9781893259894366
|
|
|
|
key: test_precision
|
|
value: [0.57142857 0.66666667 0.83333333 0.66666667 0.66666667 1.
|
|
1. 0.875 0.8 0.6 ]
|
|
|
|
mean value: 0.7679761904761905
|
|
|
|
key: train_precision
|
|
value: [0.9516129 0.9516129 0.96721311 0.98333333 0.96721311 0.93548387
|
|
0.98305085 0.95081967 0.96551724 0.95081967]
|
|
|
|
mean value: 0.9606676673360117
|
|
|
|
key: test_recall
|
|
value: [0.66666667 1. 0.83333333 1. 0.66666667 1.
|
|
0.28571429 1. 0.57142857 0.42857143]
|
|
|
|
mean value: 0.7452380952380953
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.96551724 1. ]
|
|
|
|
mean value: 0.996551724137931
|
|
|
|
key: test_roc_auc
|
|
value: [0.61904762 0.78571429 0.8452381 0.78571429 0.69047619 1.
|
|
0.64285714 0.91666667 0.70238095 0.54761905]
|
|
|
|
mean value: 0.7535714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.97413793 0.97413793 0.98275862 0.99137931 0.98275862 0.96610169
|
|
0.99152542 0.97457627 0.96580947 0.97457627]
|
|
|
|
mean value: 0.9777761542957335
|
|
|
|
key: test_jcc
|
|
value: [0.44444444 0.66666667 0.71428571 0.66666667 0.5 1.
|
|
0.28571429 0.875 0.5 0.33333333]
|
|
|
|
mean value: 0.5986111111111111
|
|
|
|
key: train_jcc
|
|
value: [0.9516129 0.9516129 0.96721311 0.98333333 0.96721311 0.93548387
|
|
0.98305085 0.95081967 0.93333333 0.95081967]
|
|
|
|
mean value: 0.957449276531414
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0101862 0.00909281 0.0090611 0.0091393 0.00983238 0.00917673
|
|
0.0095005 0.0091691 0.00907755 0.00950789]
|
|
|
|
mean value: 0.009374356269836426
|
|
|
|
key: score_time
|
|
value: [0.00923538 0.00892949 0.00899577 0.0089128 0.00890374 0.00899363
|
|
0.00957346 0.00951457 0.00898981 0.00949144]
|
|
|
|
mean value: 0.009154009819030761
|
|
|
|
key: test_mcc
|
|
value: [-0.05143445 0.54761905 0.53674504 -0.07142857 -0.28288947 0.54761905
|
|
0.28288947 0.38095238 0.38095238 -0.38575837]
|
|
|
|
mean value: 0.18852665009433853
|
|
|
|
key: train_mcc
|
|
value: [0.5393392 0.54074089 0.55597781 0.58971362 0.59133581 0.61080452
|
|
0.6087526 0.59794138 0.56027975 0.64168717]
|
|
|
|
mean value: 0.5836572739419282
|
|
|
|
key: test_accuracy
|
|
value: [0.46153846 0.76923077 0.76923077 0.46153846 0.38461538 0.76923077
|
|
0.61538462 0.69230769 0.69230769 0.30769231]
|
|
|
|
mean value: 0.5923076923076923
|
|
|
|
key: train_accuracy
|
|
value: [0.76923077 0.76923077 0.77777778 0.79487179 0.79487179 0.8034188
|
|
0.8034188 0.79487179 0.77777778 0.82051282]
|
|
|
|
mean value: 0.7905982905982906
|
|
|
|
key: test_fscore
|
|
value: [0.53333333 0.76923077 0.72727273 0.46153846 0.2 0.76923077
|
|
0.54545455 0.71428571 0.71428571 0.18181818]
|
|
|
|
mean value: 0.5616450216450216
|
|
|
|
key: train_fscore
|
|
value: [0.76521739 0.76106195 0.77586207 0.79661017 0.78947368 0.78899083
|
|
0.79279279 0.77358491 0.75925926 0.81415929]
|
|
|
|
mean value: 0.7817012336310473
|
|
|
|
key: test_precision
|
|
value: [0.44444444 0.71428571 0.8 0.42857143 0.25 0.83333333
|
|
0.75 0.71428571 0.71428571 0.25 ]
|
|
|
|
mean value: 0.589920634920635
|
|
|
|
key: train_precision
|
|
value: [0.78571429 0.7962963 0.78947368 0.79661017 0.81818182 0.84313725
|
|
0.83018868 0.85416667 0.82 0.83636364]
|
|
|
|
mean value: 0.8170132491071999
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.83333333 0.66666667 0.5 0.16666667 0.71428571
|
|
0.42857143 0.71428571 0.71428571 0.14285714]
|
|
|
|
mean value: 0.5547619047619048
|
|
|
|
key: train_recall
|
|
value: [0.74576271 0.72881356 0.76271186 0.79661017 0.76271186 0.74137931
|
|
0.75862069 0.70689655 0.70689655 0.79310345]
|
|
|
|
mean value: 0.7503506721215664
|
|
|
|
key: test_roc_auc
|
|
value: [0.47619048 0.77380952 0.76190476 0.46428571 0.36904762 0.77380952
|
|
0.63095238 0.69047619 0.69047619 0.32142857]
|
|
|
|
mean value: 0.5952380952380952
|
|
|
|
key: train_roc_auc
|
|
value: [0.76943308 0.76957919 0.77790766 0.79485681 0.79514904 0.80289305
|
|
0.80303916 0.79412624 0.77717709 0.82028054]
|
|
|
|
mean value: 0.7904441846873174
|
|
|
|
key: test_jcc
|
|
value: [0.36363636 0.625 0.57142857 0.3 0.11111111 0.625
|
|
0.375 0.55555556 0.55555556 0.1 ]
|
|
|
|
mean value: 0.4182287157287157
|
|
|
|
key: train_jcc
|
|
value: [0.61971831 0.61428571 0.63380282 0.66197183 0.65217391 0.65151515
|
|
0.65671642 0.63076923 0.6119403 0.68656716]
|
|
|
|
mean value: 0.6419460847957068
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [3.79892635 4.06916142 3.78275681 3.70944476 3.7265656 3.06774664
|
|
1.29508638 1.30997825 1.30333805 1.31682229]
|
|
|
|
mean value: 2.737982654571533
|
|
|
|
key: score_time
|
|
value: [0.03303289 0.02331901 0.03021288 0.01799679 0.02469015 0.01262164
|
|
0.01303506 0.01236606 0.0128386 0.01293612]
|
|
|
|
mean value: 0.019304919242858886
|
|
|
|
key: test_mcc
|
|
value: [0.23809524 0.73192505 1. 0.73192505 0.85714286 1.
|
|
0.73192505 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.8148156116515152
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.84615385 1. 0.84615385 0.92307692 1.
|
|
0.84615385 1. 0.92307692 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.61538462 0.85714286 1. 0.85714286 0.92307692 1.
|
|
0.83333333 1. 0.92307692 1. ]
|
|
|
|
mean value: 0.9009157509157508
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.57142857 0.75 1. 0.75 0.85714286 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.8928571428571428
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.66666667 1. 1. 1. 1. 1.
|
|
0.71428571 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.9238095238095239
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61904762 0.85714286 1. 0.85714286 0.92857143 1.
|
|
0.85714286 1. 0.92857143 1. ]
|
|
|
|
mean value: 0.9047619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.44444444 0.75 1. 0.75 0.85714286 1.
|
|
0.71428571 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.8373015873015873
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02560902 0.0465641 0.0467689 0.04626966 0.04703498 0.03659153
|
|
0.07281756 0.05043769 0.05987668 0.03430367]
|
|
|
|
mean value: 0.046627378463745116
|
|
|
|
key: score_time
|
|
value: [0.02470279 0.02194142 0.02393842 0.02235103 0.01229 0.0124898
|
|
0.02689171 0.02636933 0.01218748 0.01236296]
|
|
|
|
mean value: 0.01955249309539795
|
|
|
|
key: test_mcc
|
|
value: [ 0.28288947 0.23809524 -0.07142857 -0.09759001 0.23809524 -0.05143445
|
|
0.28288947 0.21957752 0.23809524 0.07142857]
|
|
|
|
mean value: 0.13506177232779207
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.61538462 0.46153846 0.46153846 0.61538462 0.46153846
|
|
0.61538462 0.61538462 0.61538462 0.53846154]
|
|
|
|
mean value: 0.5615384615384615
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.61538462 0.46153846 0.36363636 0.61538462 0.36363636
|
|
0.54545455 0.66666667 0.61538462 0.57142857]
|
|
|
|
mean value: 0.5485181485181485
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.57142857 0.42857143 0.4 0.57142857 0.5
|
|
0.75 0.625 0.66666667 0.57142857]
|
|
|
|
mean value: 0.5640079365079365
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.66666667 0.5 0.33333333 0.66666667 0.28571429
|
|
0.42857143 0.71428571 0.57142857 0.57142857]
|
|
|
|
mean value: 0.5571428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.63095238 0.61904762 0.46428571 0.45238095 0.61904762 0.47619048
|
|
0.63095238 0.60714286 0.61904762 0.53571429]
|
|
|
|
mean value: 0.5654761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.44444444 0.3 0.22222222 0.44444444 0.22222222
|
|
0.375 0.5 0.44444444 0.4 ]
|
|
|
|
mean value: 0.3852777777777778
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.03235841 0.00919366 0.00935245 0.0087781 0.00871897 0.00909734
|
|
0.00957131 0.0088644 0.00907183 0.00900698]
|
|
|
|
mean value: 0.01140134334564209
|
|
|
|
key: score_time
|
|
value: [0.01621103 0.00943017 0.00886726 0.00846529 0.00861454 0.00898194
|
|
0.00923371 0.00882483 0.0093236 0.00886536]
|
|
|
|
mean value: 0.009681773185729981
|
|
|
|
key: test_mcc
|
|
value: [ 0.41475753 0.23809524 0.69047619 0.09759001 -0.23809524 0.38095238
|
|
-0.07142857 0.38095238 0.54761905 -0.23809524]
|
|
|
|
mean value: 0.22028237287741703
|
|
|
|
key: train_mcc
|
|
value: [0.45433325 0.43583749 0.50423855 0.45295149 0.52149771 0.41876096
|
|
0.50511865 0.45433325 0.48858389 0.47019287]
|
|
|
|
mean value: 0.47058481235241056
|
|
|
|
key: test_accuracy
|
|
value: [0.69230769 0.61538462 0.84615385 0.53846154 0.38461538 0.69230769
|
|
0.46153846 0.69230769 0.76923077 0.38461538]
|
|
|
|
mean value: 0.6076923076923078
|
|
|
|
key: train_accuracy
|
|
value: [0.72649573 0.71794872 0.75213675 0.72649573 0.76068376 0.70940171
|
|
0.75213675 0.72649573 0.74358974 0.73504274]
|
|
|
|
mean value: 0.7350427350427351
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.61538462 0.83333333 0.57142857 0.33333333 0.71428571
|
|
0.46153846 0.71428571 0.76923077 0.42857143]
|
|
|
|
mean value: 0.6155677655677656
|
|
|
|
key: train_fscore
|
|
value: [0.71929825 0.72268908 0.75630252 0.72881356 0.76666667 0.70689655
|
|
0.75630252 0.73333333 0.75 0.73504274]
|
|
|
|
mean value: 0.737534520935
|
|
|
|
key: test_precision
|
|
value: [0.625 0.57142857 0.83333333 0.5 0.33333333 0.71428571
|
|
0.5 0.71428571 0.83333333 0.42857143]
|
|
|
|
mean value: 0.6053571428571428
|
|
|
|
key: train_precision
|
|
value: [0.74545455 0.71666667 0.75 0.72881356 0.75409836 0.70689655
|
|
0.73770492 0.70967742 0.72580645 0.72881356]
|
|
|
|
mean value: 0.7303932032145685
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.66666667 0.83333333 0.66666667 0.33333333 0.71428571
|
|
0.42857143 0.71428571 0.71428571 0.42857143]
|
|
|
|
mean value: 0.6333333333333333
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
  _warn_prf(average, modifier, msg_start, len(result))
|
|
key: train_recall
|
|
value: [0.69491525 0.72881356 0.76271186 0.72881356 0.77966102 0.70689655
|
|
0.77586207 0.75862069 0.77586207 0.74137931]
|
|
|
|
mean value: 0.7453535943892461
|
|
|
|
key: test_roc_auc
|
|
value: [0.70238095 0.61904762 0.8452381 0.54761905 0.38095238 0.69047619
|
|
0.46428571 0.69047619 0.77380952 0.38095238]
|
|
|
|
mean value: 0.6095238095238096
|
|
|
|
key: train_roc_auc
|
|
value: [0.72676797 0.71785506 0.75204559 0.72647575 0.76052016 0.70938048
|
|
0.75233781 0.72676797 0.74386324 0.73509643]
|
|
|
|
mean value: 0.7351110461718293
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.44444444 0.71428571 0.4 0.2 0.55555556
|
|
0.3 0.55555556 0.625 0.27272727]
|
|
|
|
mean value: 0.46231240981240984
|
|
|
|
key: train_jcc
|
|
value: [0.56164384 0.56578947 0.60810811 0.57333333 0.62162162 0.54666667
|
|
0.60810811 0.57894737 0.6 0.58108108]
|
|
|
|
mean value: 0.5845299596640621
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01001859 0.01448417 0.01453733 0.01442242 0.01421142 0.01466846
|
|
0.01452231 0.01492524 0.03460717 0.01412797]
|
|
|
|
mean value: 0.016052508354187013
|
|
|
|
key: score_time
|
|
value: [0.00898385 0.01171184 0.01143146 0.01166272 0.01168752 0.01167083
|
|
0.01222038 0.01173306 0.01181865 0.01167536]
|
|
|
|
mean value: 0.011459565162658692
|
|
|
|
key: test_mcc
|
|
value: [ 0.39477102 0.59160798 0.54761905 -0.09759001 0.41475753 0.85714286
|
|
0. 0.41475753 0.39477102 -0.05143445]
|
|
|
|
mean value: 0.3466402521747625
|
|
|
|
key: train_mcc
|
|
value: [0.5256472 0.70108874 0.96580947 0.82695916 0.79157144 0.8524126
|
|
0.55242412 0.75475504 0.4300616 0.93214426]
|
|
|
|
mean value: 0.7332873650973463
|
|
|
|
key: test_accuracy
|
|
value: [0.61538462 0.76923077 0.76923077 0.46153846 0.69230769 0.92307692
|
|
0.46153846 0.69230769 0.61538462 0.46153846]
|
|
|
|
mean value: 0.6461538461538462
|
|
|
|
key: train_accuracy
|
|
value: [0.72649573 0.82905983 0.98290598 0.90598291 0.88888889 0.92307692
|
|
0.73504274 0.86324786 0.65811966 0.96581197]
|
|
|
|
mean value: 0.8478632478632478
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.66666667 0.76923077 0.36363636 0.71428571 0.92307692
|
|
0. 0.66666667 0.44444444 0.36363636]
|
|
|
|
mean value: 0.5617526264585088
|
|
|
|
key: train_fscore
|
|
value: [0.78378378 0.79591837 0.98305085 0.89719626 0.89922481 0.92682927
|
|
0.63529412 0.84 0.47368421 0.96491228]
|
|
|
|
mean value: 0.8199893943639955
|
|
|
|
key: test_precision
|
|
value: [0.54545455 1. 0.71428571 0.4 0.625 1.
|
|
0. 0.8 1. 0.5 ]
|
|
|
|
mean value: 0.6584740259740259
|
|
|
|
key: train_precision
|
|
value: [0.65168539 1. 0.98305085 1. 0.82857143 0.87692308
|
|
1. 1. 1. 0.98214286]
|
|
|
|
mean value: 0.9322373603353417
|
|
|
|
key: test_recall
|
|
value: [1. 0.5 0.83333333 0.33333333 0.83333333 0.85714286
|
|
0. 0.57142857 0.28571429 0.28571429]
|
|
|
|
mean value: 0.55
|
|
|
|
key: train_recall
|
|
value: [0.98305085 0.66101695 0.98305085 0.81355932 0.98305085 0.98275862
|
|
0.46551724 0.72413793 0.31034483 0.94827586]
|
|
|
|
mean value: 0.7854763296317943
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.75 0.77380952 0.45238095 0.70238095 0.92857143
|
|
0.5 0.70238095 0.64285714 0.47619048]
|
|
|
|
mean value: 0.6571428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.72428404 0.83050847 0.98290473 0.90677966 0.88807715 0.9235827
|
|
0.73275862 0.86206897 0.65517241 0.96566335]
|
|
|
|
mean value: 0.8471800116890708
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.5 0.625 0.22222222 0.55555556 0.85714286
|
|
0. 0.5 0.28571429 0.22222222]
|
|
|
|
mean value: 0.43133116883116884
|
|
|
|
key: train_jcc
|
|
value: [0.64444444 0.66101695 0.96666667 0.81355932 0.81690141 0.86363636
|
|
0.46551724 0.72413793 0.31034483 0.93220339]
|
|
|
|
mean value: 0.7198428544215129
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01358604 0.01421356 0.0135591 0.01319122 0.01309133 0.01321769
0.01343799 0.01352835 0.03366351 0.013412 ]

mean value: 0.015490078926086425

key: score_time
value: [0.01017308 0.01169968 0.0117135 0.0116725 0.01163697 0.01164961
0.0117476 0.01171494 0.01705313 0.01171875]

mean value: 0.012077975273132324

key: test_mcc
value: [0.03289758 0.54761905 0.7200823 0.23809524 0.23809524 0.54761905
0.09759001 0.41475753 0.54761905 0.05143445]

mean value: 0.3435809491904047

key: train_mcc
value: [0.70108874 0.96580947 0.79806402 0.79924461 0.76221784 0.8120433
0.88348376 0.82644112 0.83358601 0.75745182]

mean value: 0.8139430690182574

key: test_accuracy
value: [0.53846154 0.76923077 0.84615385 0.61538462 0.61538462 0.76923077
0.53846154 0.69230769 0.76923077 0.53846154]

mean value: 0.6692307692307693

key: train_accuracy
value: [0.82905983 0.98290598 0.88888889 0.8974359 0.87179487 0.90598291
0.94017094 0.90598291 0.91452991 0.87179487]

mean value: 0.9008547008547009

key: test_fscore
value: [0.25 0.76923077 0.8 0.61538462 0.61538462 0.76923077
0.5 0.66666667 0.76923077 0.625 ]

mean value: 0.6380128205128205

key: train_fscore
value: [0.79591837 0.98305085 0.87619048 0.89285714 0.88549618 0.90434783
0.93693694 0.8952381 0.91803279 0.88188976]

mean value: 0.8969958425985054

key: test_precision
value: [0.5 0.71428571 1. 0.57142857 0.57142857 0.83333333
0.6 0.8 0.83333333 0.55555556]

mean value: 0.697936507936508

key: train_precision
value: [1. 0.98305085 1. 0.94339623 0.80555556 0.9122807
0.98113208 1. 0.875 0.8115942 ]

mean value: 0.9312009609552911

key: test_recall
value: [0.16666667 0.83333333 0.66666667 0.66666667 0.66666667 0.71428571
0.42857143 0.57142857 0.71428571 0.71428571]

mean value: 0.6142857142857143

key: train_recall
value: [0.66101695 0.98305085 0.77966102 0.84745763 0.98305085 0.89655172
0.89655172 0.81034483 0.96551724 0.96551724]

mean value: 0.8788720046756283

key: test_roc_auc
value: [0.51190476 0.77380952 0.83333333 0.61904762 0.61904762 0.77380952
0.54761905 0.70238095 0.77380952 0.52380952]

mean value: 0.6678571428571429

key: train_roc_auc
value: [0.83050847 0.98290473 0.88983051 0.89786674 0.87083577 0.90590298
0.93980129 0.90517241 0.91496201 0.87258913]

mean value: 0.9010374050263005

key: test_jcc
value: [0.14285714 0.625 0.66666667 0.44444444 0.44444444 0.625
0.33333333 0.5 0.625 0.45454545]

mean value: 0.48612914862914863

key: train_jcc
value: [0.66101695 0.96666667 0.77966102 0.80645161 0.79452055 0.82539683
0.88135593 0.81034483 0.84848485 0.78873239]

mean value: 0.816263162165426

MCC on Blind test: 0.61

Accuracy on Blind test: 0.8

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.10403562 0.09558034 0.09651375 0.09247088 0.09357142 0.09638667
0.09789968 0.09478498 0.09305978 0.0943675 ]

mean value: 0.09586706161499023

key: score_time
value: [0.01546073 0.015697 0.01498866 0.01505232 0.01517606 0.01585627
0.01545739 0.01489782 0.01497364 0.01499128]

mean value: 0.015255117416381836

key: test_mcc
value: [0.73192505 0.85714286 0.85391256 0.73192505 0.85714286 1.
0.73192505 0.69047619 0.85391256 0.85714286]

mean value: 0.8165505053698895

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.84615385 0.92307692 0.92307692 0.84615385 0.92307692 1.
0.84615385 0.84615385 0.92307692 0.92307692]

mean value: 0.9

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.85714286 0.92307692 0.90909091 0.85714286 0.92307692 1.
0.83333333 0.85714286 0.93333333 0.92307692]

mean value: 0.9016416916416916

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.75 0.85714286 1. 0.75 0.85714286 1.
1. 0.85714286 0.875 1. ]

mean value: 0.8946428571428571

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 0.83333333 1. 1. 1.
0.71428571 0.85714286 1. 0.85714286]

mean value: 0.9261904761904762

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.85714286 0.92857143 0.91666667 0.85714286 0.92857143 1.
0.85714286 0.8452381 0.91666667 0.92857143]

mean value: 0.9035714285714286

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.75 0.85714286 0.83333333 0.75 0.85714286 1.
0.71428571 0.75 0.875 0.85714286]

mean value: 0.8244047619047619

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.72

Accuracy on Blind test: 0.87

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.03330159 0.04289031 0.03018212 0.03177643 0.03046227 0.03062987
0.03917623 0.05034637 0.03564668 0.03854513]

mean value: 0.03629570007324219

key: score_time
value: [0.02010465 0.01803446 0.02383947 0.01709294 0.01847744 0.01818728
0.03815055 0.02306819 0.02303863 0.02940965]

mean value: 0.02294032573699951

key: test_mcc
value: [0.38095238 0.73192505 0.85391256 0.54761905 0.85714286 0.85391256
0.73192505 1. 0.85391256 0.73192505]

mean value: 0.7543227141338386

key: train_mcc
value: [0.96580947 1. 1. 0.96580947 1. 0.96638414
1. 0.96638414 0.96638414 0.96580947]

mean value: 0.9796580822953629

key: test_accuracy
value: [0.69230769 0.84615385 0.92307692 0.76923077 0.92307692 0.92307692
0.84615385 1. 0.92307692 0.84615385]

mean value: 0.8692307692307693

key: train_accuracy
value: [0.98290598 1. 1. 0.98290598 1. 0.98290598
1. 0.98290598 0.98290598 0.98290598]

mean value: 0.9897435897435897

key: test_fscore
value: [0.66666667 0.85714286 0.90909091 0.76923077 0.92307692 0.93333333
0.83333333 1. 0.93333333 0.83333333]

mean value: 0.8658541458541458

key: train_fscore
value: [0.98305085 1. 1. 0.98305085 1. 0.98305085
1. 0.98305085 0.98305085 0.98275862]

mean value: 0.989801285797779

key: test_precision
value: [0.66666667 0.75 1. 0.71428571 0.85714286 0.875
1. 1. 0.875 1. ]

mean value: 0.8738095238095238

key: train_precision
value: [0.98305085 1. 1. 0.98305085 1. 0.96666667
1. 0.96666667 0.96666667 0.98275862]

mean value: 0.9848860315604909

key: test_recall
value: [0.66666667 1. 0.83333333 0.83333333 1. 1.
0.71428571 1. 1. 0.71428571]

mean value: 0.8761904761904762

key: train_recall
value: [0.98305085 1. 1. 0.98305085 1. 1.
1. 1. 1. 0.98275862]

mean value: 0.9948860315604909

key: test_roc_auc
value: [0.69047619 0.85714286 0.91666667 0.77380952 0.92857143 0.91666667
0.85714286 1. 0.91666667 0.85714286]

mean value: 0.8714285714285714

key: train_roc_auc
value: [0.98290473 1. 1. 0.98290473 1. 0.98305085
1. 0.98305085 0.98305085 0.98290473]

mean value: 0.9897866744593804

key: test_jcc
value: [0.5 0.75 0.83333333 0.625 0.85714286 0.875
0.71428571 1. 0.875 0.71428571]

mean value: 0.7744047619047619

key: train_jcc
value: [0.96666667 1. 1. 0.96666667 1. 0.96666667
1. 0.96666667 0.96666667 0.96610169]

mean value: 0.9799435028248588

MCC on Blind test: 0.72

Accuracy on Blind test: 0.87

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.03159785 0.0388267 0.0496254 0.0527494 0.06397605 0.05230594
0.03799987 0.04411817 0.04123425 0.05326414]

mean value: 0.04656977653503418

key: score_time
value: [0.0215857 0.03370619 0.04122758 0.03210497 0.02467465 0.02374005
0.0220952 0.02108288 0.02432179 0.02440858]

mean value: 0.026894760131835938

key: test_mcc
value: [ 0.23809524 0.53674504 0.07142857 0.21957752 -0.54761905 0.54761905
0.14085904 0.09759001 0.50709255 -0.23809524]

mean value: 0.15732927305504008

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.61538462 0.76923077 0.53846154 0.61538462 0.23076923 0.76923077
0.53846154 0.53846154 0.69230769 0.38461538]

mean value: 0.5692307692307692

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.61538462 0.72727273 0.5 0.54545455 0.16666667 0.76923077
0.4 0.5 0.6 0.42857143]

mean value: 0.5252580752580752

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.57142857 0.8 0.5 0.6 0.16666667 0.83333333
0.66666667 0.6 1. 0.42857143]

mean value: 0.6166666666666667

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.66666667 0.66666667 0.5 0.5 0.16666667 0.71428571
0.28571429 0.42857143 0.42857143 0.42857143]

mean value: 0.47857142857142854

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.61904762 0.76190476 0.53571429 0.60714286 0.22619048 0.77380952
0.55952381 0.54761905 0.71428571 0.38095238]

mean value: 0.5726190476190476

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.44444444 0.57142857 0.33333333 0.375 0.09090909 0.625
0.25 0.33333333 0.42857143 0.27272727]

mean value: 0.37247474747474746

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.33

Accuracy on Blind test: 0.67

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.27359462 0.25197363 0.24951911 0.24853635 0.2603178 0.25177169
0.26295996 0.25529766 0.24311304 0.25735235]

mean value: 0.2554436206817627

key: score_time
value: [0.00972319 0.00941014 0.00939226 0.00937867 0.00994706 0.00936031
0.00973701 0.00932145 0.0103333 0.00949168]

mean value: 0.009609508514404296

key: test_mcc
value: [0.73192505 0.73192505 0.85391256 0.73192505 0.85714286 1.
0.73192505 1. 0.85714286 0.73192505]

mean value: 0.822782355167268

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.84615385 0.84615385 0.92307692 0.84615385 0.92307692 1.
0.84615385 1. 0.92307692 0.84615385]

mean value: 0.9

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.85714286 0.85714286 0.90909091 0.85714286 0.92307692 1.
0.83333333 1. 0.92307692 0.83333333]

mean value: 0.8993339993339993

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.75 0.75 1. 0.75 0.85714286 1.
1. 1. 1. 1. ]

mean value: 0.9107142857142857

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 0.83333333 1. 1. 1.
0.71428571 1. 0.85714286 0.71428571]

mean value: 0.9119047619047619

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.85714286 0.85714286 0.91666667 0.85714286 0.92857143 1.
0.85714286 1. 0.92857143 0.85714286]

mean value: 0.905952380952381

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.75 0.75 0.83333333 0.75 0.85714286 1.
0.71428571 1. 0.85714286 0.71428571]

mean value: 0.8226190476190476

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.76

Accuracy on Blind test: 0.87

Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])

key: fit_time
value: [0.01716733 0.01644683 0.01679516 0.01655555 0.01668715 0.0164144
0.01659274 0.01631236 0.01657629 0.01654887]

mean value: 0.016609668731689453

key: score_time
value: [0.01222205 0.01211405 0.01213098 0.01435637 0.01459837 0.01452422
0.01214623 0.0120852 0.01212072 0.01458526]

mean value: 0.013088345527648926

key: test_mcc
value: [-0.28288947 0.07142857 -0.21957752 0.09759001 -0.28288947 0.23809524
-0.22537447 0.05143445 -0.05143445 0.09759001]

mean value: -0.050602711008851185

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.38461538 0.53846154 0.38461538 0.53846154 0.38461538 0.61538462
0.38461538 0.53846154 0.46153846 0.53846154]

mean value: 0.47692307692307695

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.2 0.5 0.42857143 0.57142857 0.2 0.61538462
0.2 0.625 0.36363636 0.5 ]

mean value: 0.4204020979020979

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.25 0.5 0.375 0.5 0.25 0.66666667
0.33333333 0.55555556 0.5 0.6 ]

mean value: 0.45305555555555554

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.16666667 0.5 0.5 0.66666667 0.16666667 0.57142857
0.14285714 0.71428571 0.28571429 0.42857143]

mean value: 0.41428571428571426

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.36904762 0.53571429 0.39285714 0.54761905 0.36904762 0.61904762
0.4047619 0.52380952 0.47619048 0.54761905]

mean value: 0.4785714285714286

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.11111111 0.33333333 0.27272727 0.4 0.11111111 0.44444444
0.11111111 0.45454545 0.22222222 0.33333333]

mean value: 0.27939393939393936

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: -0.12

Accuracy on Blind test: 0.4

Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04744411 0.03657794 0.03558445 0.03328729 0.03785634 0.03590202
|
|
0.03718376 0.03619504 0.0340116 0.02791476]
|
|
|
|
mean value: 0.0361957311630249
|
|
|
|
key: score_time
|
|
value: [0.02389693 0.0237062 0.02078629 0.02116799 0.02398634 0.023525
|
|
0.02397728 0.02406883 0.02208757 0.0217185 ]
|
|
|
|
mean value: 0.022892093658447264
|
|
|
|
key: test_mcc
|
|
value: [0.41475753 0.54761905 0.73192505 0.38095238 0.38095238 0.85714286
|
|
0.23809524 0.21957752 0.85391256 0.23809524]
|
|
|
|
mean value: 0.4863029808815056
|
|
|
|
key: train_mcc
|
|
value: [0.96580947 0.93214426 0.94884541 0.89792372 0.89792372 0.93161894
|
|
0.98305085 0.96580947 0.93161894 0.96580947]
|
|
|
|
mean value: 0.9420554222581047
|
|
|
|
key: test_accuracy
|
|
value: [0.69230769 0.76923077 0.84615385 0.69230769 0.69230769 0.92307692
|
|
0.61538462 0.61538462 0.92307692 0.61538462]
|
|
|
|
mean value: 0.7384615384615385
|
|
|
|
key: train_accuracy
|
|
value: [0.98290598 0.96581197 0.97435897 0.94871795 0.94871795 0.96581197
|
|
0.99145299 0.98290598 0.96581197 0.98290598]
|
|
|
|
mean value: 0.9709401709401709
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.76923077 0.85714286 0.66666667 0.66666667 0.92307692
|
|
0.61538462 0.66666667 0.93333333 0.61538462]
|
|
|
|
mean value: 0.7427838827838827
|
|
|
|
key: train_fscore
|
|
value: [0.98305085 0.96666667 0.97478992 0.95 0.95 0.96551724
|
|
0.99145299 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9712512145681603
|
|
|
|
key: test_precision
|
|
value: [0.625 0.71428571 0.75 0.66666667 0.66666667 1.
|
|
0.66666667 0.625 0.875 0.66666667]
|
|
|
|
mean value: 0.7255952380952381
|
|
|
|
key: train_precision
|
|
value: [0.98305085 0.95081967 0.96666667 0.93442623 0.93442623 0.96551724
|
|
0.98305085 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9648992216867394
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.83333333 1. 0.66666667 0.66666667 0.85714286
|
|
0.57142857 0.71428571 1. 0.57142857]
|
|
|
|
mean value: 0.7714285714285715
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:168: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:171: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_recall
|
|
value: [0.98305085 0.98305085 0.98305085 0.96610169 0.96610169 0.96551724
|
|
1. 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9777907656341321
|
|
|
|
key: test_roc_auc
|
|
value: [0.70238095 0.77380952 0.85714286 0.69047619 0.69047619 0.92857143
|
|
0.61904762 0.60714286 0.91666667 0.61904762]
|
|
|
|
mean value: 0.7404761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [0.98290473 0.96566335 0.97428404 0.94856809 0.94856809 0.96580947
|
|
0.99152542 0.98290473 0.96580947 0.98290473]
|
|
|
|
mean value: 0.9708942139099942
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.625 0.75 0.5 0.5 0.85714286
|
|
0.44444444 0.5 0.875 0.44444444]
|
|
|
|
mean value: 0.6051587301587301
|
|
|
|
key: train_jcc
|
|
value: [0.96666667 0.93548387 0.95081967 0.9047619 0.9047619 0.93333333
|
|
0.98305085 0.96610169 0.93333333 0.96610169]
|
|
|
|
mean value: 0.9444414923244168
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.31864738 0.31613731 0.37270212 0.30933118 0.3532021 0.42899418
|
|
0.34164643 0.27218747 0.27531123 0.27682853]
|
|
|
|
mean value: 0.32649879455566405
|
|
|
|
key: score_time
|
|
value: [0.03583121 0.02314615 0.02238393 0.02144551 0.02395582 0.02265859
|
|
0.02140379 0.02456188 0.02235389 0.02389455]
|
|
|
|
mean value: 0.024163532257080077
|
|
|
|
key: test_mcc
|
|
value: [0.41475753 0.54761905 0.73192505 0.38095238 0.38095238 0.85714286
|
|
0.23809524 0.21957752 0.85391256 0.23809524]
|
|
|
|
mean value: 0.4863029808815056
|
|
|
|
key: train_mcc
|
|
value: [0.96580947 0.98304594 0.94884541 0.89792372 0.89792372 0.93161894
|
|
1. 0.96580947 0.93161894 0.96580947]
|
|
|
|
mean value: 0.9488405049573129
|
|
|
|
key: test_accuracy
|
|
value: [0.69230769 0.76923077 0.84615385 0.69230769 0.69230769 0.92307692
|
|
0.61538462 0.61538462 0.92307692 0.61538462]
|
|
|
|
mean value: 0.7384615384615385
|
|
|
|
key: train_accuracy
|
|
value: [0.98290598 0.99145299 0.97435897 0.94871795 0.94871795 0.96581197
|
|
1. 0.98290598 0.96581197 0.98290598]
|
|
|
|
mean value: 0.9743589743589743
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.76923077 0.85714286 0.66666667 0.66666667 0.92307692
|
|
0.61538462 0.66666667 0.93333333 0.61538462]
|
|
|
|
mean value: 0.7427838827838827
|
|
|
|
key: train_fscore
|
|
value: [0.98305085 0.99159664 0.97478992 0.95 0.95 0.96551724
|
|
1. 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9745989126217407
|
|
|
|
key: test_precision
|
|
value: [0.625 0.71428571 0.75 0.66666667 0.66666667 1.
|
|
0.66666667 0.625 0.875 0.66666667]
|
|
|
|
mean value: 0.7255952380952381
|
|
|
|
key: train_precision
|
|
value: [0.98305085 0.98333333 0.96666667 0.93442623 0.93442623 0.96551724
|
|
1. 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9698455030611952
|
|
|
|
key: test_recall
|
|
value: [0.83333333 0.83333333 1. 0.66666667 0.66666667 0.85714286
|
|
0.57142857 0.71428571 1. 0.57142857]
|
|
|
|
mean value: 0.7714285714285715
|
|
|
|
key: train_recall
|
|
value: [0.98305085 1. 0.98305085 0.96610169 0.96610169 0.96551724
|
|
1. 0.98275862 0.96551724 0.98275862]
|
|
|
|
mean value: 0.9794856808883694
|
|
|
|
key: test_roc_auc
|
|
value: [0.70238095 0.77380952 0.85714286 0.69047619 0.69047619 0.92857143
|
|
0.61904762 0.60714286 0.91666667 0.61904762]
|
|
|
|
mean value: 0.7404761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [0.98290473 0.99137931 0.97428404 0.94856809 0.94856809 0.96580947
|
|
1. 0.98290473 0.96580947 0.98290473]
|
|
|
|
mean value: 0.9743132670952659
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.625 0.75 0.5 0.5 0.85714286
|
|
0.44444444 0.5 0.875 0.44444444]
|
|
|
|
mean value: 0.6051587301587301
|
|
|
|
key: train_jcc
|
|
value: [0.96666667 0.98333333 0.95081967 0.9047619 0.9047619 0.93333333
|
|
1. 0.96610169 0.93333333 0.96610169]
|
|
|
|
mean value: 0.9509213538152133
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03482747 0.04060721 0.06156802 0.03389192 0.033283 0.03031182
|
|
0.04303527 0.03335857 0.04897904 0.02938032]
|
|
|
|
mean value: 0.03892426490783692
|
|
|
|
key: score_time
|
|
value: [0.01205015 0.01392388 0.02358866 0.0120697 0.01431918 0.01200342
|
|
0.01424956 0.01427317 0.01218128 0.01201129]
|
|
|
|
mean value: 0.01406702995300293
|
|
|
|
key: test_mcc
|
|
value: [0.53935989 0.52295779 0.71562645 0.80909091 0.53935989 0.80909091
|
|
0.23636364 0.80909091 0.63305416 0.82572282]
|
|
|
|
mean value: 0.6439717368137134
|
|
|
|
key: train_mcc
|
|
value: [0.88405964 0.82054446 0.85264475 0.84171619 0.89500244 0.87301232
|
|
0.92597156 0.852022 0.89495572 0.90519967]
|
|
|
|
mean value: 0.8745128753545499
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.76190476 0.85714286 0.9047619 0.76190476 0.9047619
|
|
0.61904762 0.9047619 0.80952381 0.9047619 ]
|
|
|
|
mean value: 0.819047619047619
|
|
|
|
key: train_accuracy
|
|
value: [0.94179894 0.91005291 0.92592593 0.92063492 0.94708995 0.93650794
|
|
0.96296296 0.92592593 0.94708995 0.95238095]
|
|
|
|
mean value: 0.937037037037037
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.73684211 0.84210526 0.9 0.70588235 0.90909091
|
|
0.63636364 0.90909091 0.8 0.9 ]
|
|
|
|
mean value: 0.8045257528848859
|
|
|
|
key: train_fscore
|
|
value: [0.94117647 0.90909091 0.92473118 0.9197861 0.94623656 0.93617021
|
|
0.96256684 0.92473118 0.94565217 0.95135135]
|
|
|
|
mean value: 0.936149298361715
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.77777778 0.88888889 0.9 0.85714286 0.90909091
|
|
0.63636364 0.90909091 0.88888889 1. ]
|
|
|
|
mean value: 0.8624386724386724
|
|
|
|
key: train_precision
|
|
value: [0.95652174 0.92391304 0.94505495 0.93478261 0.96703297 0.93617021
|
|
0.96774194 0.93478261 0.96666667 0.96703297]
|
|
|
|
mean value: 0.9499699694037375
|
|
|
|
key: test_recall
|
|
value: [0.6 0.7 0.8 0.9 0.6 0.90909091
|
|
0.63636364 0.90909091 0.72727273 0.81818182]
|
|
|
|
mean value: 0.76
|
|
|
|
key: train_recall
|
|
value: [0.92631579 0.89473684 0.90526316 0.90526316 0.92631579 0.93617021
|
|
0.95744681 0.91489362 0.92553191 0.93617021]
|
|
|
|
mean value: 0.9228107502799552
|
|
|
|
key: test_roc_auc
|
|
value: [0.75454545 0.75909091 0.85454545 0.90454545 0.75454545 0.90454545
|
|
0.61818182 0.90454545 0.81363636 0.90909091]
|
|
|
|
mean value: 0.8177272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [0.9418813 0.91013438 0.92603583 0.92071669 0.94720045 0.93650616
|
|
0.96293393 0.92586786 0.94697648 0.95229563]
|
|
|
|
mean value: 0.9370548712206047
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.58333333 0.72727273 0.81818182 0.54545455 0.83333333
|
|
0.46666667 0.83333333 0.66666667 0.81818182]
|
|
|
|
mean value: 0.6837878787878788
|
|
|
|
key: train_jcc
|
|
value: [0.88888889 0.83333333 0.86 0.85148515 0.89795918 0.88
|
|
0.92783505 0.86 0.89690722 0.90721649]
|
|
|
|
mean value: 0.8803625317297141
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
       'kd_values',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=166)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.94342947 1.26570821 0.92522335 1.06781173 0.90099669 0.92243218
 0.97618771 0.7888 1.1742022 0.89259076]

mean value: 0.9857382297515869

key: score_time
value: [0.01842904 0.01664925 0.01643705 0.016366 0.01821828 0.02390528
 0.01476598 0.01480126 0.01512694 0.01856089]

mean value: 0.017325997352600098

key: test_mcc
value: [0.74161985 0.82275335 0.74161985 0.82572282 0.80909091 0.71818182
 0.23636364 1. 0.90829511 0.67419986]

mean value: 0.7477847204800199

key: train_mcc
value: [1. 0.98947368 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9989473684210526

key: test_accuracy
value: [0.85714286 0.9047619 0.85714286 0.9047619 0.9047619 0.85714286
 0.61904762 1. 0.95238095 0.80952381]

mean value: 0.8666666666666667

key: train_accuracy
value: [1. 0.99470899 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9994708994708995

key: test_fscore
value: [0.82352941 0.88888889 0.82352941 0.90909091 0.9 0.85714286
 0.63636364 1. 0.95652174 0.77777778]

mean value: 0.8572844631923916

key: train_fscore
value: [1. 0.99470899 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9994708994708995

key: test_precision
value: [1. 1. 1. 0.83333333 0.9 0.9
 0.63636364 1. 0.91666667 1. ]

mean value: 0.9186363636363637

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.7 0.8 0.7 1. 0.9 0.81818182
 0.63636364 1. 1. 0.63636364]

mean value: 0.8190909090909091

key: train_recall
value: [1. 0.98947368 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9989473684210526

key: test_roc_auc
value: [0.85 0.9 0.85 0.90909091 0.90454545 0.85909091
 0.61818182 1. 0.95 0.81818182]

mean value: 0.8659090909090909

key: train_roc_auc
value: [1. 0.99473684 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9994736842105263

key: test_jcc
value: [0.7 0.8 0.7 0.83333333 0.81818182 0.75
 0.46666667 1. 0.91666667 0.63636364]

mean value: 0.7621212121212121

key: train_jcc
value: [1. 0.98947368 1. 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9989473684210526

MCC on Blind test: 0.6

Accuracy on Blind test: 0.8

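The `key` / `value` / `mean value` entries above have the shape of the dictionary returned by scikit-learn's `cross_validate` when several scorers are requested together with training scores. The sketch below shows one way such output could be produced. It is an assumption, not the script behind this log: the data is synthetic, the scorer names in `scoring` are chosen only to reproduce the key names seen above, and a plain `LogisticRegression` stands in for the models in the run.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): not the data used in this log.
X, y = make_classification(n_samples=210, n_features=20, random_state=42)

# Hypothetical scorer names, picked so the result keys read test_mcc, train_mcc, ...
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# 10-fold cross-validation; return_train_score=True adds the train_* entries.
scores = cross_validate(pipe, X, y, cv=10, scoring=scoring, return_train_score=True)

# Print each entry in the same key / value / mean value layout as the log.
for key, value in scores.items():
    print("\nkey:", key)
    print("value:", value)
    print("\nmean value:", np.mean(value))
```

Comparing each `train_*` mean with its `test_*` counterpart, as the entries above allow, is a quick check on how far the training scores run ahead of the held-out folds.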
Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02358484 0.00938153 0.00928259 0.00879526 0.00884533 0.00888205
|
|
0.00890446 0.00898314 0.01293993 0.01311731]
|
|
|
|
mean value: 0.01127164363861084
|
|
|
|
key: score_time
|
|
value: [0.01048207 0.00914145 0.00899744 0.00867033 0.00864816 0.00866389
|
|
0.00865364 0.00863838 0.01307774 0.01077056]
|
|
|
|
mean value: 0.009574365615844727
|
|
|
|
key: test_mcc
|
|
value: [0.35527986 0.23636364 0.60302269 0.35527986 0.21968621 0.58630197
|
|
0.13858047 0.42727273 0.45226702 0.24120908]
|
|
|
|
mean value: 0.3615263506116118
|
|
|
|
key: train_mcc
|
|
value: [0.43194158 0.37741808 0.40940239 0.42563559 0.39658396 0.43701355
|
|
0.42107287 0.48299607 0.34600551 0.40559385]
|
|
|
|
mean value: 0.41336634360677543
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.61904762 0.76190476 0.66666667 0.57142857 0.76190476
|
|
0.57142857 0.71428571 0.71428571 0.61904762]
|
|
|
|
mean value: 0.6666666666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.7037037 0.66666667 0.69312169 0.7037037 0.68783069 0.70899471
|
|
0.6984127 0.74074074 0.62962963 0.68783069]
|
|
|
|
mean value: 0.692063492063492
|
|
|
|
key: test_fscore
|
|
value: [0.69565217 0.6 0.8 0.69565217 0.66666667 0.81481481
|
|
0.66666667 0.72727273 0.76923077 0.69230769]
|
|
|
|
mean value: 0.7128263684785424
|
|
|
|
key: train_fscore
|
|
value: [0.74774775 0.73191489 0.73873874 0.74311927 0.73303167 0.74418605
|
|
0.73972603 0.74871795 0.72 0.73542601]
|
|
|
|
mean value: 0.7382608351962145
|
|
|
|
key: test_precision
|
|
value: [0.61538462 0.6 0.66666667 0.61538462 0.52941176 0.6875
|
|
0.5625 0.72727273 0.66666667 0.6 ]
|
|
|
|
mean value: 0.6270787056081174
|
|
|
|
key: train_precision
|
|
value: [0.65354331 0.61428571 0.64566929 0.65853659 0.64285714 0.66115702
|
|
0.648 0.72277228 0.57692308 0.63565891]
|
|
|
|
mean value: 0.6459403334606778
|
|
|
|
key: test_recall
|
|
value: [0.8 0.6 1. 0.8 0.9 1.
|
|
0.81818182 0.72727273 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8372727272727273
|
|
|
|
key: train_recall
|
|
value: [0.87368421 0.90526316 0.86315789 0.85263158 0.85263158 0.85106383
|
|
0.86170213 0.77659574 0.95744681 0.87234043]
|
|
|
|
mean value: 0.8666517357222845
|
|
|
|
key: test_roc_auc
|
|
value: [0.67272727 0.61818182 0.77272727 0.67272727 0.58636364 0.75
|
|
0.55909091 0.71363636 0.70454545 0.60909091]
|
|
|
|
mean value: 0.6659090909090909
|
|
|
|
key: train_roc_auc
|
|
value: [0.70279955 0.66539754 0.69221725 0.70291153 0.68695409 0.70974244
|
|
0.69927212 0.74092945 0.63135498 0.68880179]
|
|
|
|
mean value: 0.6920380739081747
|
|
|
|
key: test_jcc
|
|
value: [0.53333333 0.42857143 0.66666667 0.53333333 0.5 0.6875
|
|
0.5 0.57142857 0.625 0.52941176]
|
|
|
|
mean value: 0.5575245098039215
|
|
|
|
key: train_jcc
|
|
value: [0.5971223 0.57718121 0.58571429 0.59124088 0.57857143 0.59259259
|
|
0.58695652 0.59836066 0.5625 0.58156028]
|
|
|
|
mean value: 0.5851800154167459
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
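
The per-fold scores in the Gaussian NB block above can be cross-checked by hand. The sketch below recomputes the first test fold from confusion counts. The counts (TP=8, FP=5, FN=2, TN=6) are an assumption: they are not printed in the log and are inferred from the reported accuracy (14/21), precision (8/13) and recall (8/10). The variable names are illustrative only. The last quantity is balanced accuracy, which is what an ROC AUC reduces to when it is computed from hard class labels; that reading of `test_roc_auc` is also an inference from the numbers, not something the log states.

```python
from math import sqrt

# Confusion counts for Gaussian NB, test fold 1.
# ASSUMPTION: these are not in the log; they are inferred from
# accuracy = 14/21, precision = 8/13 and recall = 8/10.
tp, fp, fn, tn = 8, 5, 2, 6

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
fscore = 2 * tp / (2 * tp + fp + fn)          # F1 score
jcc = tp / (tp + fp + fn)                     # Jaccard index of the positive class
mcc = (tp * tn - fp * fn) / sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)                                             # Matthews correlation coefficient
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2

for name, value in [
    ("test_mcc", mcc),
    ("test_accuracy", accuracy),
    ("test_fscore", fscore),
    ("test_precision", precision),
    ("test_recall", recall),
    ("test_roc_auc (as balanced accuracy)", balanced_accuracy),
    ("test_jcc", jcc),
]:
    print(f"{name}: {value:.8f}")
```

Under these assumed counts, each printed value agrees with the first entry of the corresponding Gaussian NB array in the log, to the precision shown there.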
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01514578 0.0143559 0.01509309 0.01282573 0.00945067 0.00921535
|
|
0.00978017 0.00969481 0.00936055 0.00989962]
|
|
|
|
mean value: 0.011482167243957519
|
|
|
|
key: score_time
|
|
value: [0.01314116 0.01436496 0.014256 0.01244879 0.00897789 0.00877929
|
|
0.00870728 0.00949979 0.00937581 0.00883293]
|
|
|
|
mean value: 0.01083838939666748
|
|
|
|
key: test_mcc
|
|
value: [ 0.23636364 -0.06741999 0.61818182 0.53935989 -0.04545455 0.24771685
|
|
0.03739788 0.33636364 0.55161872 0.24771685]
|
|
|
|
mean value: 0.2701844747191005
|
|
|
|
key: train_mcc
|
|
value: [0.44056694 0.44988241 0.45158615 0.40571724 0.47104939 0.40741573
|
|
0.48340519 0.41808615 0.40741573 0.50382186]
|
|
|
|
mean value: 0.4438946787181437
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.47619048 0.80952381 0.76190476 0.47619048 0.61904762
|
|
0.52380952 0.66666667 0.76190476 0.61904762]
|
|
|
|
mean value: 0.6333333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.71957672 0.72486772 0.72486772 0.6984127 0.73544974 0.7037037
|
|
0.74074074 0.70899471 0.7037037 0.75132275]
|
|
|
|
mean value: 0.7211640211640211
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.35294118 0.8 0.70588235 0.47619048 0.6
|
|
0.58333333 0.66666667 0.73684211 0.6 ]
|
|
|
|
mean value: 0.6121856110865399
|
|
|
|
key: train_fscore
|
|
value: [0.71038251 0.72340426 0.71428571 0.66666667 0.73404255 0.69892473
|
|
0.72625698 0.7027027 0.69892473 0.74033149]
|
|
|
|
mean value: 0.7115922343145447
|
|
|
|
key: test_precision
|
|
value: [0.6 0.42857143 0.8 0.85714286 0.45454545 0.66666667
|
|
0.53846154 0.7 0.875 0.66666667]
|
|
|
|
mean value: 0.6587054612054611
|
|
|
|
key: train_precision
|
|
value: [0.73863636 0.7311828 0.74712644 0.75 0.74193548 0.70652174
|
|
0.76470588 0.71428571 0.70652174 0.77011494]
|
|
|
|
mean value: 0.7371031097416126
|
|
|
|
key: test_recall
|
|
value: [0.6 0.3 0.8 0.6 0.5 0.54545455
|
|
0.63636364 0.63636364 0.63636364 0.54545455]
|
|
|
|
mean value: 0.58
|
|
|
|
key: train_recall
|
|
value: [0.68421053 0.71578947 0.68421053 0.6 0.72631579 0.69148936
|
|
0.69148936 0.69148936 0.69148936 0.71276596]
|
|
|
|
mean value: 0.6889249720044793
|
|
|
|
key: test_roc_auc
|
|
value: [0.61818182 0.46818182 0.80909091 0.75454545 0.47727273 0.62272727
|
|
0.51818182 0.66818182 0.76818182 0.62272727]
|
|
|
|
mean value: 0.6327272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [0.71976484 0.72491601 0.72508399 0.69893617 0.73549832 0.70363942
|
|
0.74048152 0.70890258 0.70363942 0.75111982]
|
|
|
|
mean value: 0.7211982082866741
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.21428571 0.66666667 0.54545455 0.3125 0.42857143
|
|
0.41176471 0.5 0.58333333 0.42857143]
|
|
|
|
mean value: 0.45197192513368983
|
|
|
|
key: train_jcc
|
|
value: [0.55084746 0.56666667 0.55555556 0.5 0.57983193 0.53719008
|
|
0.57017544 0.54166667 0.53719008 0.5877193 ]
|
|
|
|
mean value: 0.5526843181420478
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01027989 0.01051307 0.00920153 0.00923085 0.00902748 0.00923491
|
|
0.01056838 0.01076412 0.01129293 0.01068258]
|
|
|
|
mean value: 0.010079574584960938
|
|
|
|
key: score_time
|
|
value: [0.01671553 0.01708245 0.01061392 0.01051021 0.01748228 0.01556611
|
|
0.01171184 0.01431727 0.01263571 0.01222897]
|
|
|
|
mean value: 0.013886427879333496
|
|
|
|
key: test_mcc
|
|
value: [-0.14545455 0.13483997 0.23373675 0.14545455 -0.39196475 -0.24771685
|
|
-0.33709993 0.33636364 0.15894099 0.05504819]
|
|
|
|
mean value: -0.0057851993419990025
|
|
|
|
key: train_mcc
|
|
value: [0.45024663 0.42906778 0.42923006 0.46035834 0.43243527 0.49287375
|
|
0.50280155 0.43919373 0.48199732 0.43065616]
|
|
|
|
mean value: 0.45488605817766276
|
|
|
|
key: test_accuracy
|
|
value: [0.42857143 0.57142857 0.61904762 0.57142857 0.33333333 0.38095238
|
|
0.33333333 0.66666667 0.57142857 0.52380952]
|
|
|
|
mean value: 0.5
|
|
|
|
key: train_accuracy
|
|
value: [0.72486772 0.71428571 0.71428571 0.73015873 0.71428571 0.74603175
|
|
0.75132275 0.71957672 0.74074074 0.71428571]
|
|
|
|
mean value: 0.726984126984127
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.47058824 0.55555556 0.57142857 0.125 0.43478261
|
|
0.22222222 0.66666667 0.52631579 0.5 ]
|
|
|
|
mean value: 0.447255964933647
|
|
|
|
key: train_fscore
|
|
value: [0.72043011 0.70967742 0.7244898 0.73015873 0.69662921 0.73626374
|
|
0.74594595 0.71957672 0.73224044 0.69662921]
|
|
|
|
mean value: 0.7212041318869982
|
|
|
|
key: test_precision
|
|
value: [0.4 0.57142857 0.625 0.54545455 0.16666667 0.41666667
|
|
0.28571429 0.7 0.625 0.55555556]
|
|
|
|
mean value: 0.48914862914862917
|
|
|
|
key: train_precision
|
|
value: [0.73626374 0.72527473 0.7029703 0.73404255 0.74698795 0.76136364
|
|
0.75824176 0.71578947 0.75280899 0.73809524]
|
|
|
|
mean value: 0.7371838358715771
|
|
|
|
key: test_recall
|
|
value: [0.4 0.4 0.5 0.6 0.1 0.45454545
|
|
0.18181818 0.63636364 0.45454545 0.45454545]
|
|
|
|
mean value: 0.41818181818181815
|
|
|
|
key: train_recall
|
|
value: [0.70526316 0.69473684 0.74736842 0.72631579 0.65263158 0.71276596
|
|
0.73404255 0.72340426 0.71276596 0.65957447]
|
|
|
|
mean value: 0.7068868980963046
|
|
|
|
key: test_roc_auc
|
|
value: [0.42727273 0.56363636 0.61363636 0.57272727 0.32272727 0.37727273
|
|
0.34090909 0.66818182 0.57727273 0.52727273]
|
|
|
|
mean value: 0.49909090909090903
|
|
|
|
key: train_roc_auc
|
|
value: [0.724972 0.7143897 0.71410974 0.73017917 0.71461366 0.74585666
|
|
0.7512318 0.71959686 0.74059351 0.71399776]
|
|
|
|
mean value: 0.7269540873460246
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.30769231 0.38461538 0.4 0.06666667 0.27777778
|
|
0.125 0.5 0.35714286 0.33333333]
|
|
|
|
mean value: 0.30022283272283273
|
|
|
|
key: train_jcc
|
|
value: [0.56302521 0.55 0.568 0.575 0.53448276 0.5826087
|
|
0.59482759 0.56198347 0.57758621 0.53448276]
|
|
|
|
mean value: 0.5641996687155415
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01432991 0.01354694 0.01185727 0.01176095 0.01348209 0.01326251
|
|
0.01258087 0.01316094 0.01310778 0.01367092]
|
|
|
|
mean value: 0.013076019287109376
|
|
|
|
key: score_time
|
|
value: [0.01142573 0.01127124 0.00964952 0.0101819 0.01053739 0.01043487
|
|
0.01040888 0.01022005 0.01001334 0.01339149]
|
|
|
|
mean value: 0.010753440856933593
|
|
|
|
key: test_mcc
|
|
value: [ 0.42727273 -0.06741999 0.80909091 0.06741999 0.13762047 0.42727273
|
|
0.04545455 0.52295779 0.71818182 0.61818182]
|
|
|
|
mean value: 0.37060328045303614
|
|
|
|
key: train_mcc
|
|
value: [0.69399986 0.73585755 0.80972114 0.74663724 0.75702928 0.78850682
|
|
0.8636019 0.77999992 0.79896965 0.79930542]
|
|
|
|
mean value: 0.7773628790574287
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.47619048 0.9047619 0.52380952 0.57142857 0.71428571
|
|
0.52380952 0.76190476 0.85714286 0.80952381]
|
|
|
|
mean value: 0.6857142857142857
|
|
|
|
key: train_accuracy
|
|
value: [0.84656085 0.86772487 0.9047619 0.87301587 0.87830688 0.89417989
|
|
0.93121693 0.88888889 0.8994709 0.8994709 ]
|
|
|
|
mean value: 0.8883597883597883
|
|
|
|
key: test_fscore
|
|
value: [0.7 0.35294118 0.9 0.58333333 0.52631579 0.72727273
|
|
0.54545455 0.7826087 0.85714286 0.81818182]
|
|
|
|
mean value: 0.6793250942981728
|
|
|
|
key: train_fscore
|
|
value: [0.85128205 0.86631016 0.90425532 0.87628866 0.87700535 0.89247312
|
|
0.92896175 0.89230769 0.89839572 0.8972973 ]
|
|
|
|
mean value: 0.8884577116689765
|
|
|
|
key: test_precision
|
|
value: [0.7 0.42857143 0.9 0.5 0.55555556 0.72727273
|
|
0.54545455 0.75 0.9 0.81818182]
|
|
|
|
mean value: 0.6825036075036075
|
|
|
|
key: train_precision
|
|
value: [0.83 0.88043478 0.91397849 0.85858586 0.89130435 0.90217391
|
|
0.95505618 0.86138614 0.90322581 0.91208791]
|
|
|
|
mean value: 0.8908233433616443
|
|
|
|
key: test_recall
|
|
value: [0.7 0.3 0.9 0.7 0.5 0.72727273
|
|
0.54545455 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.6827272727272727
|
|
|
|
key: train_recall
|
|
value: [0.87368421 0.85263158 0.89473684 0.89473684 0.86315789 0.88297872
|
|
0.90425532 0.92553191 0.89361702 0.88297872]
|
|
|
|
mean value: 0.8868309070548712
|
|
|
|
key: test_roc_auc
|
|
value: [0.71363636 0.46818182 0.90454545 0.53181818 0.56818182 0.71363636
|
|
0.52272727 0.75909091 0.85909091 0.80909091]
|
|
|
|
mean value: 0.6849999999999999
|
|
|
|
key: train_roc_auc
|
|
value: [0.84641657 0.86780515 0.90481523 0.87290034 0.87838746 0.89412094
|
|
0.93107503 0.88908175 0.89944009 0.8993841 ]
|
|
|
|
mean value: 0.8883426651735722
|
|
|
|
key: test_jcc
|
|
value: [0.53846154 0.21428571 0.81818182 0.41176471 0.35714286 0.57142857
|
|
0.375 0.64285714 0.75 0.69230769]
|
|
|
|
mean value: 0.5371430040547688
|
|
|
|
key: train_jcc
|
|
value: [0.74107143 0.76415094 0.82524272 0.77981651 0.78095238 0.80582524
|
|
0.86734694 0.80555556 0.81553398 0.81372549]
|
|
|
|
mean value: 0.7999221192956221
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.33641791 1.50435567 1.61624503 1.38623667 1.66725993 1.31514406
|
|
1.19694066 1.21551633 1.49928236 1.54568815]
|
|
|
|
mean value: 1.4283086776733398
|
|
|
|
key: score_time
|
|
value: [0.01264334 0.01469707 0.02935648 0.02213526 0.01919293 0.01318979
|
|
0.01264215 0.01259017 0.01352763 0.01921105]
|
|
|
|
mean value: 0.016918587684631347
|
|
|
|
key: test_mcc
|
|
value: [0.43007562 0.53935989 0.90829511 0.80909091 0.71818182 0.4719399
|
|
0.33028913 0.80909091 0.63305416 0.67419986]
|
|
|
|
mean value: 0.6323577308640445
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.76190476 0.95238095 0.9047619 0.85714286 0.71428571
|
|
0.66666667 0.9047619 0.80952381 0.80952381]
|
|
|
|
mean value: 0.8095238095238095
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.70588235 0.94736842 0.9 0.85714286 0.66666667
|
|
0.69565217 0.90909091 0.8 0.77777778]
|
|
|
|
mean value: 0.7926247825251729
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
key: test_precision
|
|
value: [0.75 0.85714286 1. 0.9 0.81818182 0.85714286
|
|
0.66666667 0.90909091 0.88888889 1. ]
|
|
|
|
mean value: 0.8647113997113997
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.6 0.9 0.9 0.9 0.54545455
|
|
0.72727273 0.90909091 0.72727273 0.63636364]
|
|
|
|
mean value: 0.7445454545454545
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.70909091 0.75454545 0.95 0.90454545 0.85909091 0.72272727
|
|
0.66363636 0.90454545 0.81363636 0.81818182]
|
|
|
|
mean value: 0.8099999999999999
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.54545455 0.9 0.81818182 0.75 0.5
|
|
0.53333333 0.83333333 0.66666667 0.63636364]
|
|
|
|
mean value: 0.6683333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01775002 0.01362348 0.01367092 0.01293755 0.01265097 0.01258326
|
|
0.01258826 0.01409173 0.01307774 0.01271462]
|
|
|
|
mean value: 0.013568854331970215
|
|
|
|
key: score_time
|
|
value: [0.01175952 0.00916195 0.00893497 0.00862074 0.00865817 0.00863695
|
|
0.0087688 0.00872326 0.00884724 0.00894094]
|
|
|
|
mean value: 0.009105253219604491
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.82275335 1. 0.90829511 1. 1.
|
|
0.42727273 0.90909091 0.82572282 0.80909091]
|
|
|
|
mean value: 0.8524979177943448
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.9047619 1. 0.95238095 1. 1.
|
|
0.71428571 0.95238095 0.9047619 0.9047619 ]
|
|
|
|
mean value: 0.9238095238095239
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 1. 0.94736842 1. 1.
|
|
0.72727273 0.95238095 0.9 0.90909091]
|
|
|
|
mean value: 0.9213890787574999
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.72727273 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9636363636363636
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.8 1. 0.9 1. 1.
|
|
0.72727273 0.90909091 0.81818182 0.90909091]
|
|
|
|
mean value: 0.8863636363636364
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.9 1. 0.95 1. 1.
|
|
0.71363636 0.95454545 0.90909091 0.90454545]
|
|
|
|
mean value: 0.9231818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 1. 0.9 1. 1.
|
|
0.57142857 0.90909091 0.81818182 0.83333333]
|
|
|
|
mean value: 0.8632034632034632
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.49
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09987569 0.10673809 0.11479974 0.10427046 0.10179067 0.12284112
|
|
0.14259434 0.10524893 0.10025811 0.09575725]
|
|
|
|
mean value: 0.10941743850708008
|
|
|
|
key: score_time
|
|
value: [0.02278996 0.02107644 0.02003169 0.01899815 0.01821494 0.0267005
|
|
0.02168226 0.01780367 0.01765037 0.01754332]
|
|
|
|
mean value: 0.020249128341674805
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.71562645 0.90829511 0.61818182 0.82572282 0.52727273
|
|
0.23636364 0.80909091 0.71818182 0.90909091]
|
|
|
|
mean value: 0.7090579546795412
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.85714286 0.95238095 0.80952381 0.9047619 0.76190476
|
|
0.61904762 0.9047619 0.85714286 0.95238095]
|
|
|
|
mean value: 0.8523809523809524
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.84210526 0.94736842 0.8 0.90909091 0.76190476
|
|
0.63636364 0.90909091 0.85714286 0.95238095]
|
|
|
|
mean value: 0.8504336599073441
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 1. 0.8 0.83333333 0.8
|
|
0.63636364 0.90909091 0.9 1. ]
|
|
|
|
mean value: 0.8767676767676768
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.8 0.9 0.8 1. 0.72727273
|
|
0.63636364 0.90909091 0.81818182 0.90909091]
|
|
|
|
mean value: 0.8300000000000001
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.85454545 0.95 0.80909091 0.90909091 0.76363636
|
|
0.61818182 0.90454545 0.85909091 0.95454545]
|
|
|
|
mean value: 0.8522727272727273
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.72727273 0.9 0.66666667 0.83333333 0.61538462
|
|
0.46666667 0.83333333 0.75 0.90909091]
|
|
|
|
mean value: 0.7501748251748251
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01057601 0.01102757 0.01102114 0.01060414 0.0107491 0.01077795
|
|
0.01005793 0.01006055 0.01025558 0.00970602]
|
|
|
|
mean value: 0.010483598709106446
|
|
|
|
key: score_time
|
|
value: [0.01038694 0.01018834 0.00997496 0.00985217 0.00996852 0.00999904
|
|
0.00966644 0.00945711 0.00876474 0.0086937 ]
|
|
|
|
mean value: 0.009695196151733398
|
|
|
|
key: test_mcc
|
|
value: [0.43007562 0.58630197 0.80909091 0.44038551 0.52295779 0.35527986
|
|
0.13762047 0.26967994 0.52727273 0.67419986]
|
|
|
|
mean value: 0.47528646505235683
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.76190476 0.9047619 0.71428571 0.76190476 0.66666667
|
|
0.57142857 0.61904762 0.76190476 0.80952381]
|
|
|
|
mean value: 0.7285714285714285
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.9 0.72727273 0.73684211 0.63157895
|
|
0.60869565 0.55555556 0.76190476 0.77777778]
|
|
|
|
mean value: 0.7032960860649647
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 1. 0.9 0.66666667 0.77777778 0.75
|
|
0.58333333 0.71428571 0.8 1. ]
|
|
|
|
mean value: 0.7942063492063492
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.5 0.9 0.8 0.7 0.54545455
|
|
0.63636364 0.45454545 0.72727273 0.63636364]
|
|
|
|
mean value: 0.65
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.70909091 0.75 0.90454545 0.71818182 0.75909091 0.67272727
|
|
0.56818182 0.62727273 0.76363636 0.81818182]
|
|
|
|
mean value: 0.7290909090909091
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.81818182 0.57142857 0.58333333 0.46153846
|
|
0.4375 0.38461538 0.61538462 0.63636364]
|
|
|
|
mean value: 0.5508345820845821
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.32293153 1.42478061 1.48231292 1.47654915 1.48469901 1.48599815
|
|
1.46488094 1.48408747 1.48369265 1.48373985]
|
|
|
|
mean value: 1.4593672275543212
|
|
|
|
key: score_time
|
|
value: [0.10207772 0.10522914 0.10460138 0.10517383 0.10500145 0.10545468
|
|
0.10445786 0.10404372 0.1047914 0.10368395]
|
|
|
|
mean value: 0.10445151329040528
|
|
|
|
key: test_mcc
|
|
value: [0.66332496 0.53935989 0.90829511 0.71562645 0.90909091 0.90829511
|
|
0.33028913 0.90909091 0.90829511 0.90909091]
|
|
|
|
mean value: 0.7700758470872185
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.80952381 0.76190476 0.95238095 0.85714286 0.95238095 0.95238095
|
|
0.66666667 0.95238095 0.95238095 0.95238095]
|
|
|
|
mean value: 0.8809523809523809
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.70588235 0.94736842 0.84210526 0.95238095 0.95652174
|
|
0.69565217 0.95238095 0.95652174 0.95238095]
|
|
|
|
mean value: 0.8711194546468473
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.85714286 1. 0.88888889 0.90909091 0.91666667
|
|
0.66666667 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.9155122655122655
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6 0.6 0.9 0.8 1. 1.
|
|
0.72727273 0.90909091 1. 0.90909091]
|
|
|
|
mean value: 0.8445454545454545
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.8 0.75454545 0.95 0.85454545 0.95454545 0.95
|
|
0.66363636 0.95454545 0.95 0.95454545]
|
|
|
|
mean value: 0.8786363636363637
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.54545455 0.9 0.72727273 0.90909091 0.91666667
|
|
 0.53333333 0.90909091 0.91666667 0.90909091]
|
|
|
|
mean value: 0.7866666666666666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.00507212 1.5582056 1.35330391 0.88756967 0.90433073 1.32030916
|
|
1.99715137 0.90522766 0.9060688 0.88190556]
|
|
|
|
mean value: 1.1719144582748413
|
|
|
|
key: score_time
|
|
value: [0.14654708 0.13812089 0.14643788 0.17348289 0.18689179 0.25120091
|
|
0.13585401 0.22276163 0.19270754 0.18566346]
|
|
|
|
mean value: 0.17796680927276612
|
|
|
|
key: test_mcc
|
|
value: [0.74161985 0.52295779 0.82275335 0.53935989 0.71818182 0.90829511
|
|
0.13762047 0.82572282 0.71818182 0.71818182]
|
|
|
|
mean value: 0.6652874733582891
|
|
|
|
key: train_mcc
|
|
value: [0.95767077 0.95788064 0.96830553 0.95767077 0.95788064 0.96830907
|
|
0.98947368 0.96830907 0.96830553 0.95789003]
|
|
|
|
mean value: 0.9651695734907783
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.76190476 0.9047619 0.76190476 0.85714286 0.95238095
|
|
0.57142857 0.9047619 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.97883598 0.97883598 0.98412698 0.97883598 0.97883598 0.98412698
|
|
0.99470899 0.98412698 0.98412698 0.97883598]
|
|
|
|
mean value: 0.9825396825396825
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.73684211 0.88888889 0.70588235 0.85714286 0.95652174
|
|
0.60869565 0.9 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8191788721590848
|
|
|
|
key: train_fscore
|
|
value: [0.97894737 0.97916667 0.98429319 0.97894737 0.97916667 0.98412698
|
|
0.99470899 0.98412698 0.98395722 0.97894737]
|
|
|
|
mean value: 0.9826388814528069
|
|
|
|
key: test_precision
|
|
value: [1. 0.77777778 1. 0.85714286 0.81818182 0.91666667
|
|
0.58333333 1. 0.9 0.9 ]
|
|
|
|
mean value: 0.8753102453102453
|
|
|
|
key: train_precision
|
|
value: [0.97894737 0.96907216 0.97916667 0.97894737 0.96907216 0.97894737
|
|
0.98947368 0.97894737 0.98924731 0.96875 ]
|
|
|
|
mean value: 0.9780571466286268
|
|
|
|
key: test_recall
|
|
value: [0.7 0.7 0.8 0.6 0.9 1.
|
|
0.63636364 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.7790909090909091
|
|
|
|
key: train_recall
|
|
value: [0.97894737 0.98947368 0.98947368 0.97894737 0.98947368 0.9893617
|
|
1. 0.9893617 0.9787234 0.9893617 ]
|
|
|
|
mean value: 0.9873124300111982
|
|
|
|
key: test_roc_auc
|
|
value: [0.85 0.75909091 0.9 0.75454545 0.85909091 0.95
|
|
0.56818182 0.90909091 0.85909091 0.85909091]
|
|
|
|
mean value: 0.8268181818181818
|
|
|
|
key: train_roc_auc
|
|
value: [0.97883539 0.9787794 0.98409854 0.97883539 0.9787794 0.98415454
|
|
0.99473684 0.98415454 0.98409854 0.97889138]
|
|
|
|
mean value: 0.9825363941769317
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.58333333 0.8 0.54545455 0.75 0.91666667
|
|
0.4375 0.81818182 0.75 0.75 ]
|
|
|
|
mean value: 0.7051136363636363
|
|
|
|
key: train_jcc
|
|
value: [0.95876289 0.95918367 0.96907216 0.95876289 0.95918367 0.96875
|
|
0.98947368 0.96875 0.96842105 0.95876289]
|
|
|
|
mean value: 0.9659122908523149
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.02238894 0.00916934 0.00952101 0.01010132 0.01035404 0.01048112
0.01021218 0.01014829 0.00954199 0.01029658]

mean value: 0.011221480369567872

key: score_time
value: [0.00930572 0.0087676 0.00974298 0.00951648 0.00947809 0.00952435
0.00950789 0.00956464 0.00935626 0.00951338]

mean value: 0.009427738189697266

key: test_mcc
value: [ 0.23636364 -0.06741999 0.61818182 0.53935989 -0.04545455 0.24771685
0.03739788 0.33636364 0.55161872 0.24771685]

mean value: 0.2701844747191005

key: train_mcc
value: [0.44056694 0.44988241 0.45158615 0.40571724 0.47104939 0.40741573
0.48340519 0.41808615 0.40741573 0.50382186]

mean value: 0.4438946787181437

key: test_accuracy
value: [0.61904762 0.47619048 0.80952381 0.76190476 0.47619048 0.61904762
0.52380952 0.66666667 0.76190476 0.61904762]

mean value: 0.6333333333333333

key: train_accuracy
value: [0.71957672 0.72486772 0.72486772 0.6984127 0.73544974 0.7037037
0.74074074 0.70899471 0.7037037 0.75132275]

mean value: 0.7211640211640211

key: test_fscore
value: [0.6 0.35294118 0.8 0.70588235 0.47619048 0.6
0.58333333 0.66666667 0.73684211 0.6 ]

mean value: 0.6121856110865399

key: train_fscore
value: [0.71038251 0.72340426 0.71428571 0.66666667 0.73404255 0.69892473
0.72625698 0.7027027 0.69892473 0.74033149]

mean value: 0.7115922343145447

key: test_precision
value: [0.6 0.42857143 0.8 0.85714286 0.45454545 0.66666667
0.53846154 0.7 0.875 0.66666667]

mean value: 0.6587054612054611

key: train_precision
value: [0.73863636 0.7311828 0.74712644 0.75 0.74193548 0.70652174
0.76470588 0.71428571 0.70652174 0.77011494]

mean value: 0.7371031097416126

key: test_recall
value: [0.6 0.3 0.8 0.6 0.5 0.54545455
0.63636364 0.63636364 0.63636364 0.54545455]

mean value: 0.58

key: train_recall
value: [0.68421053 0.71578947 0.68421053 0.6 0.72631579 0.69148936
0.69148936 0.69148936 0.69148936 0.71276596]

mean value: 0.6889249720044793

key: test_roc_auc
value: [0.61818182 0.46818182 0.80909091 0.75454545 0.47727273 0.62272727
0.51818182 0.66818182 0.76818182 0.62272727]

mean value: 0.6327272727272727

key: train_roc_auc
value: [0.71976484 0.72491601 0.72508399 0.69893617 0.73549832 0.70363942
0.74048152 0.70890258 0.70363942 0.75111982]

mean value: 0.7211982082866741

key: test_jcc
value: [0.42857143 0.21428571 0.66666667 0.54545455 0.3125 0.42857143
0.41176471 0.5 0.58333333 0.42857143]

mean value: 0.45197192513368983

key: train_jcc
value: [0.55084746 0.56666667 0.55555556 0.5 0.57983193 0.53719008
0.57017544 0.54166667 0.53719008 0.5877193 ]

mean value: 0.5526843181420478

MCC on Blind test: 0.44

Accuracy on Blind test: 0.73

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'Z...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [2.61034989 5.89738679 5.05655122 4.04000521 4.81949782 4.60966992
4.59260941 4.32805228 4.22248673 2.80192494]

mean value: 4.297853422164917

key: score_time
value: [0.02688909 0.02042317 0.02070355 0.01861191 0.02235937 0.0204308
0.02161312 0.02076817 0.02501345 0.0135026 ]

mean value: 0.021031522750854494

key: test_mcc
value: [0.82275335 0.82275335 0.82275335 1. 1. 0.90829511
0.62641448 1. 1. 0.80909091]

mean value: 0.881206055224815

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9047619 0.9047619 0.9047619 1. 1. 0.95238095
0.80952381 1. 1. 0.9047619 ]

mean value: 0.9380952380952381

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.88888889 0.88888889 0.88888889 1. 1. 0.95652174
0.83333333 1. 1. 0.90909091]

mean value: 0.9365612648221344

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 1. 1. 1. 0.91666667
0.76923077 1. 1. 0.90909091]

mean value: 0.9594988344988344

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.8 0.8 0.8 1. 1. 1.
0.90909091 1. 1. 0.90909091]

mean value: 0.9218181818181819

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.9 0.9 0.9 1. 1. 0.95
0.80454545 1. 1. 0.90454545]

mean value: 0.9359090909090909

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.8 0.8 0.8 1. 1. 0.91666667
0.71428571 1. 1. 0.83333333]

mean value: 0.8864285714285715

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.87

Accuracy on Blind test: 0.93

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.06827331 0.07484531 0.05876684 0.03986669 0.06439209 0.07714224
0.06403208 0.0616889 0.03780341 0.04415965]

mean value: 0.0590970516204834

key: score_time
value: [0.02553988 0.02265811 0.02189136 0.01328444 0.01226473 0.01283622
0.02830124 0.02277637 0.01287508 0.0122354 ]

mean value: 0.01846628189086914

key: test_mcc
value: [0.66332496 0.45226702 0.43007562 0.63305416 0.44038551 0.71818182
0.60302269 0.71562645 0.71818182 0.67419986]

mean value: 0.6048319896654357

key: train_mcc
value: [0.98947251 0.93650616 0.97883539 0.98947368 0.97883539 0.95767077
0.94755736 0.93650616 0.94755736 0.96873621]

mean value: 0.9631150997764043

key: test_accuracy
value: [0.80952381 0.71428571 0.71428571 0.80952381 0.71428571 0.85714286
0.76190476 0.85714286 0.85714286 0.80952381]

mean value: 0.7904761904761904

key: train_accuracy
value: [0.99470899 0.96825397 0.98941799 0.99470899 0.98941799 0.97883598
0.97354497 0.96825397 0.97354497 0.98412698]

mean value: 0.9814814814814814

key: test_fscore
value: [0.75 0.625 0.66666667 0.81818182 0.72727273 0.85714286
0.70588235 0.86956522 0.85714286 0.77777778]

mean value: 0.7654632274517185

key: train_fscore
value: [0.9947644 0.96842105 0.98947368 0.99470899 0.98947368 0.9787234
0.97297297 0.96808511 0.97297297 0.98378378]

mean value: 0.9813380054035413

key: test_precision
value: [1. 0.83333333 0.75 0.75 0.66666667 0.9
1. 0.83333333 0.9 1. ]

mean value: 0.8633333333333333

key: train_precision
value: [0.98958333 0.96842105 0.98947368 1. 0.98947368 0.9787234
0.98901099 0.96808511 0.98901099 1. ]

mean value: 0.9861782243046241

key: test_recall
value: [0.6 0.5 0.6 0.9 0.8 0.81818182
0.54545455 0.90909091 0.81818182 0.63636364]

mean value: 0.7127272727272728

key: train_recall
value: [1. 0.96842105 0.98947368 0.98947368 0.98947368 0.9787234
0.95744681 0.96808511 0.95744681 0.96808511]

mean value: 0.9766629339305711

key: test_roc_auc
value: [0.8 0.70454545 0.70909091 0.81363636 0.71818182 0.85909091
0.77272727 0.85454545 0.85909091 0.81818182]

mean value: 0.7909090909090909

key: train_roc_auc
value: [0.99468085 0.96825308 0.98941769 0.99473684 0.98941769 0.97883539
0.97346025 0.96825308 0.97346025 0.98404255]

mean value: 0.9814557670772677

key: test_jcc
value: [0.6 0.45454545 0.5 0.69230769 0.57142857 0.75
0.54545455 0.76923077 0.75 0.63636364]

mean value: 0.6269330669330669

key: train_jcc
value: [0.98958333 0.93877551 0.97916667 0.98947368 0.97916667 0.95833333
0.94736842 0.93814433 0.94736842 0.96808511]

mean value: 0.9635465472799757

MCC on Blind test: 0.11

Accuracy on Blind test: 0.53

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.02180147 0.01045227 0.01006985 0.0100987 0.00979686 0.00981426
0.00987458 0.01014495 0.01003695 0.01014376]

mean value: 0.011223363876342773

key: score_time
value: [0.00946331 0.00991654 0.00950646 0.00958514 0.00926518 0.00935149
0.00945377 0.00949931 0.00967503 0.00960541]

mean value: 0.00953216552734375

key: test_mcc
value: [0.14545455 0.13762047 0.63305416 0.33028913 0.30914104 0.45226702
0.13762047 0.61818182 0.71818182 0.23373675]

mean value: 0.3715547220669732

key: train_mcc
value: [0.41147388 0.40281841 0.38806379 0.41147388 0.39005594 0.40105488
0.40396007 0.42240682 0.43243527 0.41239882]

mean value: 0.40761417768557445

key: test_accuracy
value: [0.57142857 0.57142857 0.80952381 0.66666667 0.61904762 0.71428571
0.57142857 0.80952381 0.85714286 0.61904762]

mean value: 0.680952380952381

key: train_accuracy
value: [0.7037037 0.6984127 0.69312169 0.7037037 0.69312169 0.6984127
0.6984127 0.70899471 0.71428571 0.7037037 ]

mean value: 0.7015873015873015

key: test_fscore
value: [0.57142857 0.52631579 0.81818182 0.63157895 0.69230769 0.76923077
0.60869565 0.81818182 0.85714286 0.66666667]

mean value: 0.6959730582156212

key: train_fscore
value: [0.7254902 0.72463768 0.71 0.7254902 0.71568627 0.71641791
0.72195122 0.72636816 0.73 0.72277228]

mean value: 0.7218813914217747

key: test_precision
value: [0.54545455 0.55555556 0.75 0.66666667 0.5625 0.66666667
0.58333333 0.81818182 0.9 0.61538462]

mean value: 0.6663743201243202

key: train_precision
value: [0.67889908 0.66964286 0.67619048 0.67889908 0.66972477 0.6728972
0.66666667 0.68224299 0.68867925 0.67592593]

mean value: 0.6759768293904649

key: test_recall
value: [0.6 0.5 0.9 0.6 0.9 0.90909091
0.63636364 0.81818182 0.81818182 0.72727273]

mean value: 0.740909090909091

key: train_recall
value: [0.77894737 0.78947368 0.74736842 0.77894737 0.76842105 0.76595745
0.78723404 0.77659574 0.77659574 0.77659574]

mean value: 0.7746136618141097

key: test_roc_auc
value: [0.57272727 0.56818182 0.81363636 0.66363636 0.63181818 0.70454545
0.56818182 0.80909091 0.85909091 0.61363636]

mean value: 0.6804545454545454

key: train_roc_auc
value: [0.70330347 0.69792833 0.69283315 0.70330347 0.69272116 0.6987682
0.69888018 0.7093505 0.71461366 0.70408735]

mean value: 0.7015789473684211

key: test_jcc
value: [0.4 0.35714286 0.69230769 0.46153846 0.52941176 0.625
0.4375 0.69230769 0.75 0.5 ]

mean value: 0.5445208468002586

key: train_jcc
value: [0.56923077 0.56818182 0.5503876 0.56923077 0.55725191 0.55813953
0.5648855 0.5703125 0.57480315 0.56589147]

mean value: 0.5648315015480971

MCC on Blind test: 0.12

Accuracy on Blind test: 0.6

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01307464 0.01691055 0.01696324 0.01449084 0.018785 0.01570868
0.01574159 0.01785326 0.01694202 0.01764059]

mean value: 0.01641104221343994

key: score_time
value: [0.0097959 0.0115664 0.0114677 0.01176572 0.01179075 0.01177979
0.01176739 0.01171255 0.01223779 0.01183796]

mean value: 0.01157219409942627

key: test_mcc
value: [0.43007562 0.46249729 0.71562645 0.80909091 0.36244122 0.66332496
0.13762047 0.67419986 0.80909091 0.60302269]

mean value: 0.5666990369115905

key: train_mcc
value: [0.88757469 0.63581076 0.96830553 0.76193012 0.76291765 0.67012598
0.88607273 0.71085804 0.94755736 0.91860433]

mean value: 0.8149757191296667

key: test_accuracy
value: [0.71428571 0.66666667 0.85714286 0.9047619 0.66666667 0.80952381
0.57142857 0.80952381 0.9047619 0.76190476]

mean value: 0.7666666666666666

key: train_accuracy
value: [0.94179894 0.78835979 0.98412698 0.87830688 0.86772487 0.80952381
0.94179894 0.83597884 0.97354497 0.95767196]

mean value: 0.8978835978835978

key: test_fscore
value: [0.66666667 0.74074074 0.84210526 0.9 0.53333333 0.84615385
0.60869565 0.77777778 0.90909091 0.70588235]

mean value: 0.7530446542036258

key: train_fscore
value: [0.94472362 0.82608696 0.98429319 0.87150838 0.84848485 0.83928571
0.94358974 0.80254777 0.97297297 0.95555556]

mean value: 0.898904875380721

key: test_precision
value: [0.75 0.58823529 0.88888889 0.9 0.8 0.73333333
0.58333333 1. 0.90909091 1. ]

mean value: 0.8152881758764112

key: train_precision
value: [0.90384615 0.7037037 0.97916667 0.92857143 1. 0.72307692
0.91089109 1. 0.98901099 1. ]

mean value: 0.9138266953984776

key: test_recall
value: [0.6 1. 0.8 0.9 0.4 1.
0.63636364 0.63636364 0.90909091 0.54545455]

mean value: 0.7427272727272727

key: train_recall
value: [0.98947368 1. 0.98947368 0.82105263 0.73684211 1.
0.9787234 0.67021277 0.95744681 0.91489362]

mean value: 0.9058118701007839

key: test_roc_auc
value: [0.70909091 0.68181818 0.85454545 0.90454545 0.65454545 0.8
0.56818182 0.81818182 0.90454545 0.77272727]

mean value: 0.7668181818181818

key: train_roc_auc
value: [0.94154535 0.78723404 0.98409854 0.87861142 0.86842105 0.81052632
0.94199328 0.83510638 0.97346025 0.95744681]

mean value: 0.8978443449048152

key: test_jcc
value: [0.5 0.58823529 0.72727273 0.81818182 0.36363636 0.73333333
0.4375 0.63636364 0.83333333 0.54545455]

mean value: 0.6183311051693404

key: train_jcc
value: [0.8952381 0.7037037 0.96907216 0.77227723 0.73684211 0.72307692
0.89320388 0.67021277 0.94736842 0.91489362]

mean value: 0.8225888907479606

MCC on Blind test: 0.61

Accuracy on Blind test: 0.8

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
'kd_values',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=166)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01661968 0.01596975 0.01542592 0.01607823 0.01489186 0.01536465
0.01502705 0.01687884 0.03487134 0.02883863]

mean value: 0.018996596336364746

key: score_time
value: [0.01196527 0.01202655 0.01178312 0.01207161 0.01201534 0.01186323
0.01198721 0.01221204 0.02217579 0.02880287]

mean value: 0.014690303802490234

key: test_mcc
value: [0.53935989 0. 0.74161985 0.90909091 0.53300179 0.50874702
0.03739788 0.90909091 0.50874702 0.26593594]

mean value: 0.49529912072758353

key: train_mcc
value: [0.94713854 0.47421554 0.89436546 0.87787601 0.51260702 0.64546146
0.84944554 0.83355494 0.38837405 0.37937244]

mean value: 0.6802411002611571

key: test_accuracy
value: [0.76190476 0.52380952 0.85714286 0.95238095 0.71428571 0.71428571
0.52380952 0.95238095 0.71428571 0.61904762]

mean value: 0.7333333333333333

key: train_accuracy
value: [0.97354497 0.68253968 0.94708995 0.93650794 0.70899471 0.79365079
0.92063492 0.91005291 0.62962963 0.62433862]

mean value: 0.8126984126984127

key: test_fscore
value: [0.70588235 0. 0.82352941 0.95238095 0.76923077 0.78571429
0.58333333 0.95238095 0.78571429 0.71428571]

mean value: 0.7072452057746176

key: train_fscore
value: [0.97382199 0.53846154 0.94791667 0.94 0.7755102 0.82819383
0.92537313 0.9005848 0.72868217 0.72586873]

mean value: 0.8284413057399109

key: test_precision
value: [0.85714286 0. 1. 0.90909091 0.625 0.64705882
0.53846154 1. 0.64705882 0.58823529]

mean value: 0.6812048245871776

key: train_precision
value: [0.96875 1. 0.93814433 0.8952381 0.63333333 0.70676692
0.86915888 1. 0.57317073 0.56969697]

mean value: 0.8154259255670528

key: test_recall
value: [0.6 0. 0.7 1. 1. 1.
0.63636364 0.90909091 1. 0.90909091]

mean value: 0.7754545454545454

key: train_recall
value: [0.97894737 0.36842105 0.95789474 0.98947368 1. 1.
0.9893617 0.81914894 1. 1. ]

mean value: 0.9103247480403136

key: test_roc_auc
value: [0.75454545 0.5 0.85 0.95454545 0.72727273 0.7
0.51818182 0.95454545 0.7 0.60454545]

mean value: 0.7263636363636363

key: train_roc_auc
value: [0.97351624 0.68421053 0.94703247 0.9362262 0.70744681 0.79473684
0.92099664 0.90957447 0.63157895 0.62631579]

mean value: 0.8131634938409854

key: test_jcc
value: [0.54545455 0. 0.7 0.90909091 0.625 0.64705882
0.41176471 0.90909091 0.64705882 0.55555556]

mean value: 0.5950074272133096

key: train_jcc
value: [0.94897959 0.36842105 0.9009901 0.88679245 0.63333333 0.70676692
0.86111111 0.81914894 0.57317073 0.56969697]

mean value: 0.726841119562058

MCC on Blind test: 0.67

Accuracy on Blind test: 0.8

Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.15451217 0.15445352 0.2040379 0.15504122 0.15569568 0.15498495
|
|
0.15341592 0.16030359 0.15658045 0.15535378]
|
|
|
|
mean value: 0.1604379177093506
|
|
|
|
key: score_time
|
|
value: [0.021106 0.02071619 0.02216125 0.0211637 0.02109528 0.021137
|
|
0.02115512 0.02139163 0.02109694 0.02112269]
|
|
|
|
mean value: 0.02121458053588867
|
|
|
|
key: test_mcc
|
|
value: [0.90829511 0.90829511 0.90829511 0.82275335 0.90829511 0.71562645
|
|
0.52295779 1. 0.90829511 0.90909091]
|
|
|
|
mean value: 0.8511904027211744
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95238095 0.95238095 0.95238095 0.9047619 0.95238095 0.85714286
|
|
0.76190476 1. 0.95238095 0.95238095]
|
|
|
|
mean value: 0.9238095238095237
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.94736842 0.88888889 0.94736842 0.86956522
|
|
0.7826087 1. 0.95652174 0.95238095]
|
|
|
|
mean value: 0.9239439177654281
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 0.83333333
|
|
0.75 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 0.9 0.9 0.8 0.9 0.90909091
|
|
0.81818182 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9036363636363637
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95 0.95 0.95 0.9 0.95 0.85454545
|
|
0.75909091 1. 0.95 0.95454545]
|
|
|
|
mean value: 0.9218181818181819
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.9 0.8 0.9 0.76923077
|
|
0.64285714 1. 0.91666667 0.90909091]
|
|
|
|
mean value: 0.8637845487845488
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0783937 0.06188798 0.0702672 0.065274 0.08460093 0.07100463
|
|
0.07129097 0.0830369 0.0656898 0.09316111]
|
|
|
|
mean value: 0.07446072101593018
|
|
|
|
key: score_time
|
|
value: [0.02479196 0.02323222 0.02332997 0.02291703 0.02248645 0.02907228
|
|
0.0261395 0.02889991 0.02425504 0.03242946]
|
|
|
|
mean value: 0.02575538158416748
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.82275335 1. 0.90829511 0.90829511 0.80909091
|
|
0.62641448 1. 1. 0.80909091]
|
|
|
|
mean value: 0.8706693216360863
|
|
|
|
key: train_mcc
|
|
value: [0.98947368 1. 0.98947368 0.98947368 0.97905701 1.
|
|
1. 0.98947251 0.97905701 0.96830907]
|
|
|
|
mean value: 0.9884316657513018
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.9047619 1. 0.95238095 0.95238095 0.9047619
|
|
0.80952381 1. 1. 0.9047619 ]
|
|
|
|
mean value: 0.9333333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.99470899 1. 0.99470899 0.99470899 0.98941799 1.
|
|
1. 0.99470899 0.98941799 0.98412698]
|
|
|
|
mean value: 0.9941798941798942
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 1. 0.94736842 0.94736842 0.90909091
|
|
0.83333333 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9324029771398192
|
|
|
|
key: train_fscore
|
|
value: [0.99470899 1. 0.99470899 0.99470899 0.9893617 1.
|
|
1. 0.99465241 0.98947368 0.98412698]
|
|
|
|
mean value: 0.9941741761009266
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 0.90909091
|
|
0.76923077 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9587412587412587
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.97916667 0.97894737]
|
|
|
|
mean value: 0.995811403508772
|
|
|
|
key: test_recall
|
|
value: [0.8 0.8 1. 0.9 0.9 0.90909091
|
|
0.90909091 1. 1. 0.90909091]
|
|
|
|
mean value: 0.9127272727272727
|
|
|
|
key: train_recall
|
|
value: [0.98947368 1. 0.98947368 0.98947368 0.97894737 1.
|
|
1. 0.9893617 1. 0.9893617 ]
|
|
|
|
mean value: 0.9926091825307951
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.9 1. 0.95 0.95 0.90454545
|
|
0.80454545 1. 1. 0.90454545]
|
|
|
|
mean value: 0.9313636363636364
|
|
|
|
key: train_roc_auc
|
|
value: [0.99473684 1. 0.99473684 0.99473684 0.98947368 1.
|
|
1. 0.99468085 0.98947368 0.98415454]
|
|
|
|
mean value: 0.9941993281075028
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 1. 0.9 0.9 0.83333333
|
|
0.71428571 1. 1. 0.83333333]
|
|
|
|
mean value: 0.8780952380952382
|
|
|
|
key: train_jcc
|
|
value: [0.98947368 1. 0.98947368 0.98947368 0.97894737 1.
|
|
1. 0.9893617 0.97916667 0.96875 ]
|
|
|
|
mean value: 0.9884646789846958
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09878612 0.09960461 0.09258199 0.09417605 0.09641194 0.08949471
|
|
0.09223723 0.11594605 0.1179924 0.12251711]
|
|
|
|
mean value: 0.10197482109069825
|
|
|
|
key: score_time
|
|
value: [0.03881836 0.03061247 0.03737235 0.03541827 0.03377557 0.02945137
|
|
0.03179479 0.0367341 0.0444777 0.0131073 ]
|
|
|
|
mean value: 0.03315622806549072
|
|
|
|
key: test_mcc
|
|
value: [ 0.23373675 0.62641448 0.74161985 0.33636364 0.63305416 0.42727273
|
|
-0.03739788 0.82572282 0.4719399 0.67419986]
|
|
|
|
mean value: 0.49329263185333116
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61904762 0.80952381 0.85714286 0.66666667 0.80952381 0.71428571
|
|
0.47619048 0.9047619 0.71428571 0.80952381]
|
|
|
|
mean value: 0.7380952380952381
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.55555556 0.77777778 0.82352941 0.66666667 0.81818182 0.72727273
|
|
0.42105263 0.9 0.66666667 0.77777778]
|
|
|
|
mean value: 0.7134481033242643
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.625 0.875 1. 0.63636364 0.75 0.72727273
|
|
0.5 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.797077922077922
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.7 0.7 0.7 0.9 0.72727273
|
|
0.36363636 0.81818182 0.54545455 0.63636364]
|
|
|
|
mean value: 0.6590909090909091
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61363636 0.80454545 0.85 0.66818182 0.81363636 0.71363636
|
|
0.48181818 0.90909091 0.72272727 0.81818182]
|
|
|
|
mean value: 0.7395454545454545
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.38461538 0.63636364 0.7 0.5 0.69230769 0.57142857
|
|
0.26666667 0.81818182 0.5 0.63636364]
|
|
|
|
mean value: 0.5705927405927406
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.4725976 0.47249556 0.44347739 0.43946433 0.55229735 0.44706702
|
|
0.44039941 0.434376 0.51444864 0.48032069]
|
|
|
|
mean value: 0.4696943998336792
|
|
|
|
key: score_time
|
|
value: [0.01501417 0.01269388 0.0131402 0.0126729 0.01365566 0.01293302
|
|
0.01276636 0.01274252 0.01311779 0.0128448 ]
|
|
|
|
mean value: 0.013158130645751952
|
|
|
|
key: test_mcc
|
|
value: [0.82275335 0.90829511 1. 0.90829511 1. 1.
|
|
0.62641448 1. 1. 1. ]
|
|
|
|
mean value: 0.9265758046971604
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9047619 0.95238095 1. 0.95238095 1. 1.
|
|
0.80952381 1. 1. 1. ]
|
|
|
|
mean value: 0.9619047619047619
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.94736842 1. 0.94736842 1. 1.
|
|
0.83333333 1. 1. 1. ]
|
|
|
|
mean value: 0.9616959064327486
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.76923077 1. 1. 1. ]
|
|
|
|
mean value: 0.9769230769230769
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.9 1. 0.9 1. 1.
|
|
0.90909091 1. 1. 1. ]
|
|
|
|
mean value: 0.9509090909090909
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.95 1. 0.95 1. 1.
|
|
0.80454545 1. 1. 1. ]
|
|
|
|
mean value: 0.9604545454545454
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.9 1. 0.9 1. 1.
|
|
0.71428571 1. 1. 1. ]
|
|
|
|
mean value: 0.9314285714285715
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03971577 0.05372071 0.04902482 0.0534029 0.05441594 0.0602479
|
|
0.05954766 0.05021262 0.05034852 0.06813431]
|
|
|
|
mean value: 0.05387711524963379
|
|
|
|
key: score_time
|
|
value: [0.02256966 0.02043724 0.01937819 0.01983213 0.01934934 0.01934004
|
|
0.01987314 0.02205038 0.07356358 0.0410378 ]
|
|
|
|
mean value: 0.027743148803710937
|
|
|
|
key: test_mcc
|
|
value: [0.60302269 0.82572282 0.67419986 0.60302269 0.60302269 0.66332496
|
|
0.50874702 0.66332496 0.74161985 0.82275335]
|
|
|
|
mean value: 0.6708760888902932
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.76190476 0.9047619 0.80952381 0.76190476 0.76190476 0.80952381
|
|
0.71428571 0.80952381 0.85714286 0.9047619 ]
|
|
|
|
mean value: 0.8095238095238095
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.90909091 0.83333333 0.8 0.8 0.84615385
|
|
0.78571429 0.84615385 0.88 0.91666667]
|
|
|
|
mean value: 0.8417112887112888
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.83333333 0.71428571 0.66666667 0.66666667 0.73333333
|
|
0.64705882 0.73333333 0.78571429 0.84615385]
|
|
|
|
mean value: 0.7293212669683258
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.77272727 0.90909091 0.81818182 0.77272727 0.77272727 0.8
|
|
0.7 0.8 0.85 0.9 ]
|
|
|
|
mean value: 0.8095454545454546
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.83333333 0.71428571 0.66666667 0.66666667 0.73333333
|
|
0.64705882 0.73333333 0.78571429 0.84615385]
|
|
|
|
mean value: 0.7293212669683258
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03402305 0.05161929 0.07370734 0.047405 0.04301023 0.04708791
|
|
0.04369712 0.04251552 0.04322314 0.04266644]
|
|
|
|
mean value: 0.046895503997802734
|
|
|
|
key: score_time
|
|
value: [0.03400087 0.05133557 0.03652644 0.0336163 0.03723359 0.03171778
|
|
0.0283761 0.03356314 0.02800155 0.03366947]
|
|
|
|
mean value: 0.03480408191680908
|
|
|
|
key: test_mcc
|
|
value: [0.74161985 0.62641448 0.74161985 0.90909091 0.71818182 0.71818182
|
|
0.23636364 1. 0.80909091 0.82572282]
|
|
|
|
mean value: 0.732628609547866
|
|
|
|
key: train_mcc
|
|
value: [0.95767077 0.92597156 0.96830907 0.94714446 0.96830553 0.93672304
|
|
0.95767077 0.95767077 0.95788064 0.96830907]
|
|
|
|
mean value: 0.9545655686835185
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.80952381 0.85714286 0.95238095 0.85714286 0.85714286
|
|
0.61904762 1. 0.9047619 0.9047619 ]
|
|
|
|
mean value: 0.8619047619047618
|
|
|
|
key: train_accuracy
|
|
value: [0.97883598 0.96296296 0.98412698 0.97354497 0.98412698 0.96825397
|
|
0.97883598 0.97883598 0.97883598 0.98412698]
|
|
|
|
mean value: 0.9772486772486773
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.77777778 0.82352941 0.95238095 0.85714286 0.85714286
|
|
0.63636364 1. 0.90909091 0.9 ]
|
|
|
|
mean value: 0.8536957813428402
|
|
|
|
key: train_fscore
|
|
value: [0.97894737 0.96335079 0.98412698 0.97354497 0.98429319 0.96842105
|
|
0.9787234 0.9787234 0.97849462 0.98412698]
|
|
|
|
mean value: 0.9772752774075717
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 1. 0.90909091 0.81818182 0.9
|
|
0.63636364 1. 0.90909091 1. ]
|
|
|
|
mean value: 0.9047727272727273
|
|
|
|
key: train_precision
|
|
value: [0.97894737 0.95833333 0.9893617 0.9787234 0.97916667 0.95833333
|
|
0.9787234 0.9787234 0.98913043 0.97894737]
|
|
|
|
mean value: 0.9768390419851665
|
|
|
|
key: test_recall
|
|
value: [0.7 0.7 0.7 1. 0.9 0.81818182
|
|
0.63636364 1. 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8181818181818181
|
|
|
|
key: train_recall
|
|
value: [0.97894737 0.96842105 0.97894737 0.96842105 0.98947368 0.9787234
|
|
0.9787234 0.9787234 0.96808511 0.9893617 ]
|
|
|
|
mean value: 0.9777827547592385
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:188: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:191: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_roc_auc
|
|
value: [0.85 0.80454545 0.85 0.95454545 0.85909091 0.85909091
|
|
0.61818182 1. 0.90454545 0.90909091]
|
|
|
|
mean value: 0.8609090909090908
|
|
|
|
key: train_roc_auc
|
|
value: [0.97883539 0.96293393 0.98415454 0.97357223 0.98409854 0.96830907
|
|
0.97883539 0.97883539 0.9787794 0.98415454]
|
|
|
|
mean value: 0.9772508398656214
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.63636364 0.7 0.90909091 0.75 0.75
|
|
0.46666667 1. 0.83333333 0.81818182]
|
|
|
|
mean value: 0.7563636363636363
|
|
|
|
key: train_jcc
|
|
value: [0.95876289 0.92929293 0.96875 0.94845361 0.96907216 0.93877551
|
|
0.95833333 0.95833333 0.95789474 0.96875 ]
|
|
|
|
mean value: 0.9556418502799597
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa',
|
|
'kd_values',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=166)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.38085413 0.38319826 0.41932321 0.38498354 0.38405871 0.38967252
|
|
0.35881495 0.39832497 0.41211724 0.37258697]
|
|
|
|
mean value: 0.3883934497833252
|
|
|
|
key: score_time
|
|
value: [0.03946233 0.02718139 0.03768969 0.03861022 0.03152013 0.03818846
|
|
0.03179121 0.0374887 0.02729797 0.03727818]
|
|
|
|
mean value: 0.0346508264541626
|
|
|
|
key: test_mcc
|
|
value: [0.74161985 0.74161985 0.66332496 0.90909091 0.71818182 0.80909091
|
|
0.23636364 1. 0.90829511 0.74795759]
|
|
|
|
mean value: 0.7475544626453499
|
|
|
|
key: train_mcc
|
|
value: [0.95767077 0.95767077 0.96830907 0.94714446 0.94714446 0.95767077
|
|
0.95767077 0.95767077 0.96830553 0.95767077]
|
|
|
|
mean value: 0.9576928147788344
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.85714286 0.80952381 0.95238095 0.85714286 0.9047619
|
|
0.61904762 1. 0.95238095 0.85714286]
|
|
|
|
mean value: 0.8666666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.97883598 0.97883598 0.98412698 0.97354497 0.97354497 0.97883598
|
|
0.97883598 0.97883598 0.98412698 0.97883598]
|
|
|
|
mean value: 0.9788359788359788
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.82352941 0.75 0.95238095 0.85714286 0.90909091
|
|
0.63636364 1. 0.95652174 0.84210526]
|
|
|
|
mean value: 0.8550664180796096
|
|
|
|
key: train_fscore
|
|
value: [0.97894737 0.97894737 0.98412698 0.97354497 0.97354497 0.9787234
|
|
0.9787234 0.9787234 0.98395722 0.9787234 ]
|
|
|
|
mean value: 0.978796250433165
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 0.90909091 0.81818182 0.90909091
|
|
0.63636364 1. 0.91666667 1. ]
|
|
|
|
mean value: 0.918939393939394
|
|
|
|
key: train_precision
|
|
value: [0.97894737 0.97894737 0.9893617 0.9787234 0.9787234 0.9787234
|
|
0.9787234 0.9787234 0.98924731 0.9787234 ]
|
|
|
|
mean value: 0.9808844176329636
|
|
|
|
key: test_recall
|
|
value: [0.7 0.7 0.6 1. 0.9 0.90909091
|
|
0.63636364 1. 1. 0.72727273]
|
|
|
|
mean value: 0.8172727272727273
|
|
|
|
key: train_recall
|
|
value: [0.97894737 0.97894737 0.97894737 0.96842105 0.96842105 0.9787234
|
|
0.9787234 0.9787234 0.9787234 0.9787234 ]
|
|
|
|
mean value: 0.9767301231802912
|
|
|
|
key: test_roc_auc
|
|
value: [0.85 0.85 0.8 0.95454545 0.85909091 0.90454545
|
|
0.61818182 1. 0.95 0.86363636]
|
|
|
|
mean value: 0.865
|
|
|
|
key: train_roc_auc
|
|
value: [0.97883539 0.97883539 0.98415454 0.97357223 0.97357223 0.97883539
|
|
0.97883539 0.97883539 0.98409854 0.97883539]
|
|
|
|
mean value: 0.9788409854423291
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.7 0.6 0.90909091 0.75 0.83333333
|
|
0.46666667 1. 0.91666667 0.72727273]
|
|
|
|
mean value: 0.7603030303030303
|
|
|
|
key: train_jcc
|
|
value: [0.95876289 0.95876289 0.96875 0.94845361 0.94845361 0.95833333
|
|
0.95833333 0.95833333 0.96842105 0.95833333]
|
|
|
|
mean value: 0.9584937375655634
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.8
|