/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data.py:550: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
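Note on the SettingWithCopyWarning: pandas raises it when a DataFrame derived from another one is modified in place. A minimal sketch of one common way to avoid it, assuming `mask_check` is a filtered subset of a larger frame. The frame `df`, its contents, and the filter threshold are hypothetical; only the `ligand_distance` column name and the `sort_values` arguments come from the log.

```python
import pandas as pd

# Hypothetical stand-in for the data handled in ml_data.py.
df = pd.DataFrame({
    "mutationinformation": ["A1B", "C2D", "E3F", "G4H"],
    "ligand_distance": [12.5, 3.1, 8.7, 20.0],
})

# Take an explicit copy of the filtered rows so later changes never act on a
# view of df, then sort by reassignment instead of inplace=True.
mask_check = df[df["ligand_distance"] < 15].copy()
mask_check = mask_check.sort_values(by=["ligand_distance"], ascending=True)

print(mask_check)
```

Reassigning the sorted result keeps the operation explicit and does not depend on whether the subset is a view or a copy.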
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 531

PASS: my_features_df and aa_df successfully combined
nrows: 531
ncols: 286
count of NULL values before imputation

or_mychisq          263
log10_or_mychisq    263
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

No. of numerical features: 44
No. of categorical features: 7

index: 0
ind: 1

Mask count check: True

index: 1
ind: 2

Mask count check: True
Original Data
Counter({0: 76, 1: 43}) Data dim: (119, 51)

-------------------------------------------------------------
Successfully split data: UQ [no aa_index but active site included] training
actual values: training set
imputed values: blind test set
Train data size: (119, 51)
Test data size: (412, 51)
y_train numbers: Counter({0: 76, 1: 43})
y_train ratio: 1.7674418604651163

y_test_numbers: Counter({0: 409, 1: 3})
y_test ratio: 136.33333333333334
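Note on the ratios: the log does not show how `y_train ratio` and `y_test ratio` are computed. The sketch below assumes they are the majority-class count divided by the minority-class count, which matches the class counts printed above. The helper name `class_ratio` is hypothetical.

```python
from collections import Counter


def class_ratio(counts: Counter) -> float:
    """Return the majority-class count divided by the minority-class count."""
    ordered = counts.most_common()  # sorted from most to least frequent
    return ordered[0][1] / ordered[-1][1]


# Class counts copied from the log.
y_train_counts = Counter({0: 76, 1: 43})
y_test_counts = Counter({0: 409, 1: 3})

train_ratio = class_ratio(y_train_counts)
test_ratio = class_ratio(y_test_counts)

print(train_ratio)
print(test_ratio)
```

Under this assumption the blind test set is far more imbalanced than the training set, with only 3 positive cases among 412 rows.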
-------------------------------------------------------------
Simple Random OverSampling
Counter({0: 76, 1: 76})
(152, 51)
Simple Random UnderSampling
Counter({0: 43, 1: 43})
(86, 51)
Simple Combined Over and UnderSampling
Counter({0: 76, 1: 76})
(152, 51)
SMOTE_NC OverSampling
Counter({0: 76, 1: 76})
(152, 51)
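Note on the resampling counts: the log does not show which library or settings produced the balanced sets. The sketch below illustrates the idea behind simple random oversampling with the standard library only: minority-class rows are duplicated at random until both classes have the same count. The function `random_oversample` and the placeholder rows are hypothetical, and this is not the implementation used in the run.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=42):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement until this class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Class sizes copied from the log; the single-feature rows are placeholders.
labels = [0] * 76 + [1] * 43
rows = [[float(i)] for i in range(len(labels))]

X_res, y_res = random_oversample(rows, labels)
print(Counter(y_res), len(X_res))
```

Starting from 76 and 43 rows, balancing to the majority class gives 152 rows in total, the same row count the log reports for the oversampled set.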
#####################################################################

Running ML analysis: UQ [without AA index but with active site annotations]
Gene name: gid
Drug name: streptomycin

Output directory: /home/tanu/git/Data/streptomycin/output/ml/uq_v1/

Sanity checks:
Total input features: 51

Training data size: (119, 51)
Test data size: (412, 51)

Target feature numbers (training data): Counter({0: 76, 1: 43})
Target features ratio (training data): 1.7674418604651163

Target feature numbers (test data): Counter({0: 409, 1: 3})
Target features ratio (test data): 136.33333333333334

#####################################################################
================================================================

Structural features (n): 35
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.01348901 0.01226354 0.01228833 0.01419163 0.01196408 0.01235008
 0.01226306 0.01198006 0.01196694 0.01303458]

mean value: 0.012579131126403808

key: score_time
value: [0.00877213 0.00875831 0.0089767  0.00837636 0.00834727 0.00831747
 0.00833845 0.00829291 0.00832725 0.00867438]

mean value: 0.008518123626708984

key: test_mcc
value: [0.42640143 0.40824829 0.         0.625      0.63245553 0.70710678
 0.68313005 0.83666003 0.31428571 0.62360956]

mean value: 0.5256897392741394

key: train_mcc
value: [0.73433335 0.80052092 0.81774488 0.71490799 0.77603911 0.73433335
 0.75414636 0.75414636 0.79379397 0.7364483 ]

mean value: 0.7616414584886299
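Note on the MCC summaries: each "mean value" is the arithmetic mean of the ten per-fold scores listed above it. The sketch below recomputes both means from the printed fold values and reports the gap between training and test MCC. The printed values are rounded to eight decimals, so the recomputed means agree with the logged ones only to about that precision. The variable names are hypothetical.

```python
from statistics import fmean

# Per-fold Matthews correlation coefficients copied from the log.
test_mcc = [0.42640143, 0.40824829, 0.0, 0.625, 0.63245553,
            0.70710678, 0.68313005, 0.83666003, 0.31428571, 0.62360956]
train_mcc = [0.73433335, 0.80052092, 0.81774488, 0.71490799, 0.77603911,
             0.73433335, 0.75414636, 0.75414636, 0.79379397, 0.7364483]

mean_test = fmean(test_mcc)
mean_train = fmean(train_mcc)

# A large positive gap means the model scores better on the folds it was
# fitted on than on the held-out folds.
gap = mean_train - mean_test

print(f"mean test MCC:  {mean_test:.4f}")
print(f"mean train MCC: {mean_train:.4f}")
print(f"train-test gap: {gap:.4f}")
```

The spread across test folds, from 0 to about 0.84, is also worth noting alongside the mean, since each fold here holds only around a dozen samples.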
key: test_accuracy
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
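Note on the ConvergenceWarning: the pipeline printed earlier already rescales the numerical features with MinMaxScaler, so of the two remedies the warning names, raising `max_iter` is the one left to try. The sketch below uses synthetic data and a simplified pipeline; the data, the value 1000, and the variable names are assumptions and do not come from the run.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): 119 rows to mirror the training-set
# size in the log; the features themselves are random.
X, y = make_classification(n_samples=119, n_features=10, random_state=42)

max_iter = 1000  # assumed value; scikit-learn's default for lbfgs is 100

pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegression(max_iter=max_iter, random_state=42)),
])
pipe.fit(X, y)

# n_iter_ holds the number of iterations the solver actually ran.
n_iter_used = int(pipe.named_steps["model"].n_iter_[0])
print(n_iter_used, max_iter)
```

If `n_iter_used` is below `max_iter`, the solver stopped because it converged and not because it ran out of iterations.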
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.75 0.75 0.5 0.83333333 0.83333333 0.83333333
|
|
0.83333333 0.91666667 0.66666667 0.81818182]
|
|
|
|
mean value: 0.7734848484848484
|
|
|
|
key: train_accuracy
|
|
value: [0.87850467 0.90654206 0.91588785 0.86915888 0.89719626 0.87850467
|
|
0.88785047 0.88785047 0.90654206 0.87962963]
|
|
|
|
mean value: 0.89076670128072
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.57142857 0.4 0.75 0.66666667 0.8
|
|
0.75 0.88888889 0.6 0.66666667]
|
|
|
|
mean value: 0.6493650793650794
|
|
|
|
key: train_fscore
|
|
value: [0.82191781 0.85714286 0.87671233 0.8 0.84931507 0.82191781
|
|
0.82352941 0.82352941 0.86111111 0.81690141]
|
|
|
|
mean value: 0.8352077213932715
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.33333333 0.75 1. 0.66666667
|
|
1. 1. 0.6 1. ]
|
|
|
|
mean value: 0.8016666666666666
|
|
|
|
key: train_precision
|
|
value: [0.88235294 0.96774194 0.94117647 0.90322581 0.91176471 0.88235294
|
|
0.93333333 0.93333333 0.91176471 0.90625 ]
|
|
|
|
mean value: 0.9173296173308033
|
|
|
|
key: test_recall
|
|
value: [0.25 0.5 0.5 0.75 0.5 1. 0.6 0.8 0.6 0.5 ]
|
|
|
|
mean value: 0.6
|
|
|
|
key: train_recall
|
|
value: [0.76923077 0.76923077 0.82051282 0.71794872 0.79487179 0.76923077
|
|
0.73684211 0.73684211 0.81578947 0.74358974]
|
|
|
|
mean value: 0.7674089068825911
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.6875 0.5 0.8125 0.75 0.875
|
|
0.8 0.9 0.65714286 0.75 ]
|
|
|
|
mean value: 0.7357142857142858
|
|
|
|
key: train_roc_auc
|
|
value: [0.85520362 0.87726244 0.89555053 0.83691554 0.87537707 0.85520362
|
|
0.8539283 0.8539283 0.88615561 0.85005574]
|
|
|
|
mean value: 0.8639580766297014
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.4 0.25 0.6 0.5 0.66666667
|
|
0.6 0.8 0.42857143 0.5 ]
|
|
|
|
mean value: 0.49952380952380954
|
|
|
|
key: train_jcc
|
|
value: [0.69767442 0.75 0.7804878 0.66666667 0.73809524 0.69767442
|
|
0.7 0.7 0.75609756 0.69047619]
|
|
|
|
mean value: 0.7177172298301056
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.38598299 0.37587595 0.37197232 0.36586213 0.3655436 0.3787601
|
|
0.37384486 0.3567059 0.36220002 0.34969902]
|
|
|
|
mean value: 0.3686446905136108
|
|
|
|
key: score_time
|
|
value: [0.00918126 0.00917006 0.00951552 0.00891447 0.00908375 0.00929761
|
|
0.00941563 0.00886798 0.00938153 0.00919795]
|
|
|
|
mean value: 0.00920257568359375
|
|
|
|
key: test_mcc
|
|
value: [1. 0.625 0.35355339 0.83666003 0.625 0.70710678
|
|
0.83666003 0.83666003 0.50709255 0.60714286]
|
|
|
|
mean value: 0.6934875661362015
|
|
|
|
key: train_mcc
|
|
value: [0.89876312 1. 0.9600061 0.85805669 0.95965309 0.95965309
|
|
0.93862091 0.85625561 1. 0.81859189]
|
|
|
|
mean value: 0.9249600511154796
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.83333333 0.66666667 0.91666667 0.83333333 0.83333333
|
|
0.91666667 0.91666667 0.75 0.81818182]
|
|
|
|
mean value: 0.8484848484848485
|
|
|
|
key: train_accuracy
|
|
value: [0.95327103 1. 0.98130841 0.93457944 0.98130841 0.98130841
|
|
0.97196262 0.93457944 1. 0.91666667]
|
|
|
|
mean value: 0.9654984423676012
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.6 0.88888889 0.75 0.8
|
|
0.88888889 0.88888889 0.72727273 0.75 ]
|
|
|
|
mean value: 0.8043939393939394
|
|
|
|
key: train_fscore
|
|
value: [0.93506494 1. 0.97368421 0.90666667 0.97435897 0.97435897
|
|
0.96 0.90410959 1. 0.87671233]
|
|
|
|
mean value: 0.9504955678784085
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.5 0.8 0.75 0.66666667
|
|
1. 1. 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7883333333333333
|
|
|
|
key: train_precision
|
|
value: [0.94736842 1. 1. 0.94444444 0.97435897 0.97435897
|
|
0.97297297 0.94285714 1. 0.94117647]
|
|
|
|
mean value: 0.9697537400633376
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 1. 0.75 1. 0.8 0.8 0.8 0.75]
|
|
|
|
mean value: 0.84
|
|
|
|
key: train_recall
|
|
value: [0.92307692 1. 0.94871795 0.87179487 0.97435897 0.97435897
|
|
0.94736842 0.86842105 1. 0.82051282]
|
|
|
|
mean value: 0.9328609986504723
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.6875 0.9375 0.8125 0.875
|
|
0.9 0.9 0.75714286 0.80357143]
|
|
|
|
mean value: 0.8485714285714285
|
|
|
|
key: train_roc_auc
|
|
value: [0.94683258 1. 0.97435897 0.92119155 0.97982655 0.97982655
|
|
0.96643783 0.91971777 1. 0.89576366]
|
|
|
|
mean value: 0.9583955462135567
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.42857143 0.8 0.6 0.66666667
|
|
0.8 0.8 0.57142857 0.6 ]
|
|
|
|
mean value: 0.6866666666666666
|
|
|
|
key: train_jcc
|
|
value: [0.87804878 1. 0.94871795 0.82926829 0.95 0.95
|
|
0.92307692 0.825 1. 0.7804878 ]
|
|
|
|
mean value: 0.9084599749843653
|
|
|
|
MCC on Blind test: 0.01
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00942707 0.00902557 0.00695324 0.00660396 0.00666595 0.00662398
|
|
0.00659394 0.00675702 0.00663805 0.00661373]
|
|
|
|
mean value: 0.007190251350402832
|
|
|
|
key: score_time
|
|
value: [0.01058674 0.01051068 0.00814915 0.00790286 0.00790191 0.0078671
|
|
0.00790071 0.00779772 0.00793719 0.0078752 ]
|
|
|
|
mean value: 0.008442926406860351
|
|
|
|
key: test_mcc
|
|
value: [0.81649658 0.47809144 0.5 0.23904572 0.35355339 0.47809144
|
|
0.16903085 0.50709255 0.16903085 0.35634832]
|
|
|
|
mean value: 0.4066781158133809
|
|
|
|
key: train_mcc
|
|
value: [0.63375685 0.67693504 0.66003337 0.51450646 0.70701192 0.58648859
|
|
0.69614472 0.60558322 0.65590587 0.6700827 ]
|
|
|
|
mean value: 0.6406448743407447
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.75 0.66666667 0.58333333 0.66666667 0.75
|
|
0.58333333 0.75 0.58333333 0.54545455]
|
|
|
|
mean value: 0.6795454545454546
|
|
|
|
key: train_accuracy
|
|
value: [0.80373832 0.8317757 0.8317757 0.71962617 0.85046729 0.81308411
|
|
0.8317757 0.82242991 0.80373832 0.83333333]
|
|
|
|
mean value: 0.8141744548286605
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.66666667 0.66666667 0.54545455 0.6 0.66666667
|
|
0.54545455 0.72727273 0.54545455 0.61538462]
|
|
|
|
mean value: 0.6436163836163835
|
|
|
|
key: train_fscore
|
|
value: [0.77419355 0.8 0.79069767 0.70588235 0.81818182 0.71428571
|
|
0.80434783 0.6984127 0.77894737 0.79545455]
|
|
|
|
mean value: 0.7680403546589664
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 0.5 0.42857143 0.5 0.6
|
|
0.5 0.66666667 0.5 0.44444444]
|
|
|
|
mean value: 0.5739682539682539
|
|
|
|
key: train_precision
|
|
value: [0.66666667 0.70588235 0.72340426 0.57142857 0.73469388 0.80645161
|
|
0.68518519 0.88 0.64912281 0.71428571]
|
|
|
|
mean value: 0.7137121043298253
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 1. 0.75 0.75 0.75 0.6 0.8 0.6 1. ]
|
|
|
|
mean value: 0.775
|
|
|
|
key: train_recall
|
|
value: [0.92307692 0.92307692 0.87179487 0.92307692 0.92307692 0.64102564
|
|
0.97368421 0.57894737 0.97368421 0.8974359 ]
|
|
|
|
mean value: 0.8628879892037787
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.75 0.75 0.625 0.6875 0.75
|
|
0.58571429 0.75714286 0.58571429 0.64285714]
|
|
|
|
mean value: 0.7008928571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.82918552 0.85124434 0.8403092 0.76300905 0.86595023 0.77639517
|
|
0.8636537 0.76773455 0.84191457 0.84726867]
|
|
|
|
mean value: 0.8246665009957512
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.5 0.5 0.375 0.42857143 0.5
|
|
0.375 0.57142857 0.375 0.44444444]
|
|
|
|
mean value: 0.48194444444444445
|
|
|
|
key: train_jcc
|
|
value: [0.63157895 0.66666667 0.65384615 0.54545455 0.69230769 0.55555556
|
|
0.67272727 0.53658537 0.63793103 0.66037736]
|
|
|
|
mean value: 0.625303059275329
|
|
|
|
MCC on Blind test: 0.03
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00705171 0.00684381 0.00682545 0.00677752 0.0068419 0.0067687
|
|
0.00679421 0.00680947 0.0066936 0.00685263]
|
|
|
|
mean value: 0.0068259000778198246
|
|
|
|
key: score_time
|
|
value: [0.00836945 0.0079124 0.0078671 0.00792718 0.00797105 0.00790572
|
|
0.00787711 0.00786543 0.00802302 0.00789714]
|
|
|
|
mean value: 0.007961559295654296
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0.25 -0.23904572 0.47809144 0.40824829 0.
|
|
-0.09759001 0.52915026 0.31428571 0.38575837]
|
|
|
|
mean value: 0.20288983564397506
|
|
|
|
key: train_mcc
|
|
value: [0.4754902 0.50673892 0.4653488 0.44239297 0.48817818 0.50337256
|
|
0.39242808 0.39534618 0.48161946 0.37522992]
|
|
|
|
mean value: 0.4526145268371993
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.66666667 0.41666667 0.75 0.75 0.41666667
|
|
0.5 0.75 0.66666667 0.72727273]
|
|
|
|
mean value: 0.6310606060606061
|
|
|
|
key: train_accuracy
|
|
value: [0.75700935 0.77570093 0.75700935 0.74766355 0.76635514 0.77570093
|
|
0.72897196 0.71962617 0.76635514 0.72222222]
|
|
|
|
mean value: 0.7516614745586708
|
|
|
|
key: test_fscore
|
|
value: [0. 0.5 0.22222222 0.66666667 0.57142857 0.46153846
|
|
0.25 0.57142857 0.6 0.57142857]
|
|
|
|
mean value: 0.4414713064713065
|
|
|
|
key: train_fscore
|
|
value: [0.66666667 0.67567568 0.64864865 0.63013699 0.66666667 0.66666667
|
|
0.5915493 0.61538462 0.65753425 0.57142857]
|
|
|
|
mean value: 0.6390358039788872
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.2 0.6 0.66666667 0.33333333
|
|
0.33333333 1. 0.6 0.66666667]
|
|
|
|
mean value: 0.49
|
|
|
|
key: train_precision
|
|
value: [0.66666667 0.71428571 0.68571429 0.67647059 0.69444444 0.72727273
|
|
0.63636364 0.6 0.68571429 0.64516129]
|
|
|
|
mean value: 0.6732093639019635
|
|
|
|
key: test_recall
|
|
value: [0. 0.5 0.25 0.75 0.5 0.75 0.2 0.4 0.6 0.5 ]
|
|
|
|
mean value: 0.445
|
|
|
|
key: train_recall
|
|
value: [0.66666667 0.64102564 0.61538462 0.58974359 0.64102564 0.61538462
|
|
0.55263158 0.63157895 0.63157895 0.51282051]
|
|
|
|
mean value: 0.6097840755735493
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.625 0.375 0.75 0.6875 0.5
|
|
0.45714286 0.7 0.65714286 0.67857143]
|
|
|
|
mean value: 0.5930357142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.7377451 0.74698341 0.72680995 0.71398944 0.73963047 0.74151584
|
|
0.68935927 0.69984744 0.73607933 0.67670011]
|
|
|
|
mean value: 0.7208660360817448
|
|
|
|
key: test_jcc
|
|
value: [0. 0.33333333 0.125 0.5 0.4 0.3
|
|
0.14285714 0.4 0.42857143 0.4 ]
|
|
|
|
mean value: 0.30297619047619045
|
|
|
|
key: train_jcc
|
|
value: [0.5 0.51020408 0.48 0.46 0.5 0.5
|
|
0.42 0.44444444 0.48979592 0.4 ]
|
|
|
|
mean value: 0.47044444444444444
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00671077 0.00906372 0.00673389 0.00643206 0.00652814 0.00724578
|
|
0.00731564 0.00711012 0.00714564 0.00714374]
|
|
|
|
mean value: 0.007142949104309082
|
|
|
|
key: score_time
|
|
value: [0.04456663 0.02610064 0.00889969 0.00866151 0.00879526 0.00940728
|
|
0.00941706 0.00944591 0.00944233 0.00941896]
|
|
|
|
mean value: 0.014415526390075683
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0. 0.47809144 0.625 0.15811388 0.47809144
|
|
0.07559289 0.29277002 0.11952286 -0.03857584]
|
|
|
|
mean value: 0.21886067104052556
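Note: each "mean value" line appears to be the arithmetic mean of the ten per-fold values printed just above it. The sketch below checks this for the K-Nearest Neighbors test_mcc values copied from this log (rounded to 8 decimals as printed). The variable names are illustrative and are not taken from the original script. The sample standard deviation is included only as an assumption about what is useful to look at next to the mean, since the fold-to-fold spread is large here.

```python
import statistics

# Per-fold test_mcc values for K-Nearest Neighbors, copied from the log above.
knn_test_mcc = [
    0.0, 0.0, 0.47809144, 0.625, 0.15811388,
    0.47809144, 0.07559289, 0.29277002, 0.11952286, -0.03857584,
]

# Arithmetic mean over the 10 folds; should agree with the logged
# "mean value" up to the rounding of the printed fold scores.
fold_mean = statistics.mean(knn_test_mcc)

# Sample standard deviation across folds (not reported in the log).
fold_sd = statistics.stdev(knn_test_mcc)

print(f"mean test_mcc over folds: {fold_mean:.6f}")
print(f"sd of test_mcc over folds: {fold_sd:.6f}")
```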
|
|
|
|
key: train_mcc
|
|
value: [0.47836451 0.54358024 0.65128682 0.48080439 0.47687292 0.38417516
|
|
0.55925621 0.60298802 0.55802654 0.50141804]
|
|
|
|
mean value: 0.5236772842107089
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.58333333 0.75 0.83333333 0.66666667 0.75
|
|
0.58333333 0.66666667 0.58333333 0.54545455]
|
|
|
|
mean value: 0.6628787878787878
|
|
|
|
key: train_accuracy
|
|
value: [0.76635514 0.79439252 0.8411215 0.76635514 0.76635514 0.72897196
|
|
0.80373832 0.82242991 0.80373832 0.77777778]
|
|
|
|
mean value: 0.7871235721703012
|
|
|
|
key: test_fscore
|
|
value: [0. 0.28571429 0.66666667 0.75 0.33333333 0.66666667
|
|
0.28571429 0.5 0.44444444 0.28571429]
|
|
|
|
mean value: 0.4218253968253968
|
|
|
|
key: train_fscore
|
|
value: [0.63768116 0.66666667 0.75362319 0.64788732 0.62686567 0.53968254
|
|
0.69565217 0.70769231 0.67692308 0.64705882]
|
|
|
|
mean value: 0.6599732931818586
|
|
|
|
key: test_precision
|
|
value: [0. 0.33333333 0.6 0.75 0.5 0.6
|
|
0.5 0.66666667 0.5 0.33333333]
|
|
|
|
mean value: 0.47833333333333333
|
|
|
|
key: train_precision
|
|
value: [0.73333333 0.81481481 0.86666667 0.71875 0.75 0.70833333
|
|
0.77419355 0.85185185 0.81481481 0.75862069]
|
|
|
|
mean value: 0.7791379052857084
|
|
|
|
key: test_recall
|
|
value: [0. 0.25 0.75 0.75 0.25 0.75 0.2 0.4 0.4 0.25]
|
|
|
|
mean value: 0.4
|
|
|
|
key: train_recall
|
|
value: [0.56410256 0.56410256 0.66666667 0.58974359 0.53846154 0.43589744
|
|
0.63157895 0.60526316 0.57894737 0.56410256]
|
|
|
|
mean value: 0.5738866396761133
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.75 0.8125 0.5625 0.75
|
|
0.52857143 0.62857143 0.55714286 0.48214286]
|
|
|
|
mean value: 0.6071428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.72322775 0.74528658 0.80392157 0.72869532 0.71776018 0.66647813
|
|
0.76506484 0.77364607 0.7532418 0.73132664]
|
|
|
|
mean value: 0.7408648884655077
|
|
|
|
key: test_jcc
|
|
value: [0. 0.16666667 0.5 0.6 0.2 0.5
|
|
0.16666667 0.33333333 0.28571429 0.16666667]
|
|
|
|
mean value: 0.2919047619047619
|
|
|
|
key: train_jcc
|
|
value: [0.46808511 0.5 0.60465116 0.47916667 0.45652174 0.36956522
|
|
0.53333333 0.54761905 0.51162791 0.47826087]
|
|
|
|
mean value: 0.4948831049856425
|
|
|
|
MCC on Blind test: 0.04
|
|
|
|
Accuracy on Blind test: 0.82
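Note: a blind-test accuracy of 0.82 next to an MCC of 0.04 is what one would expect if the blind set is class-imbalanced and the model mostly predicts the majority class. The log does not report the blind-test class counts, so the confusion-matrix counts below are hypothetical and chosen only to illustrate the effect. The helper names `mcc` and `accuracy` are likewise illustrative, not functions from the original script.

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical counts (NOT from the log): 100 samples, 15 of them positive,
# and a model that finds only 1 of the 15 positives.
tp, fp, fn, tn = 1, 4, 14, 81
print("accuracy:", round(accuracy(tp, fp, fn, tn), 2))
print("mcc:", round(mcc(tp, fp, fn, tn), 2))

# Degenerate case: the model never predicts the positive class.
print("all-negative accuracy:", accuracy(0, 0, 15, 85))
print("all-negative mcc:", mcc(0, 0, 15, 85))
```

Under these assumed counts the accuracy stays high because the majority class dominates, while MCC stays close to zero because almost no positives are recovered.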
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00767875 0.00732279 0.00758123 0.00753927 0.00771689 0.00793982
|
|
0.00819755 0.00749803 0.00751305 0.00764227]
|
|
|
|
mean value: 0.0076629638671875
|
|
|
|
key: score_time
|
|
value: [0.00804496 0.00836205 0.00849819 0.0081315 0.00821066 0.00882912
|
|
0.0087676 0.00863767 0.00823522 0.00829029]
|
|
|
|
mean value: 0.008400726318359374
|
|
|
|
key: test_mcc
|
|
value: [0.42640143 0.40824829 0.11952286 0.81649658 0.63245553 0.83666003
|
|
0.35675303 0.52915026 0.11952286 0.41833001]
|
|
|
|
mean value: 0.4663540894023734
|
|
|
|
key: train_mcc
|
|
value: [0.71777084 0.72240602 0.71777084 0.69776211 0.73774797 0.67769958
|
|
0.672375 0.71336904 0.78283392 0.67891024]
|
|
|
|
mean value: 0.7118645559720945
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.75 0.58333333 0.91666667 0.83333333 0.91666667
|
|
0.66666667 0.75 0.58333333 0.72727273]
|
|
|
|
mean value: 0.7477272727272727
|
|
|
|
key: train_accuracy
|
|
value: [0.86915888 0.86915888 0.86915888 0.85981308 0.87850467 0.85046729
|
|
0.85046729 0.86915888 0.89719626 0.85185185]
|
|
|
|
mean value: 0.8664935964001385
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.57142857 0.44444444 0.85714286 0.66666667 0.88888889
|
|
0.33333333 0.57142857 0.44444444 0.4 ]
|
|
|
|
mean value: 0.5577777777777778
|
|
|
|
key: train_fscore
|
|
value: [0.79411765 0.78787879 0.79411765 0.7761194 0.8115942 0.75757576
|
|
0.75 0.78787879 0.83076923 0.75757576]
|
|
|
|
mean value: 0.7847627221679594
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.4 1. 1. 0.8
|
|
1. 1. 0.5 1. ]
|
|
|
|
mean value: 0.8366666666666667
|
|
|
|
key: train_precision
|
|
value: [0.93103448 0.96296296 0.93103448 0.92857143 0.93333333 0.92592593
|
|
0.92307692 0.92857143 1. 0.92592593]
|
|
|
|
mean value: 0.939043689388517
|
|
|
|
key: test_recall
|
|
value: [0.25 0.5 0.5 0.75 0.5 1. 0.2 0.4 0.4 0.25]
|
|
|
|
mean value: 0.475
|
|
|
|
key: train_recall
|
|
value: [0.69230769 0.66666667 0.69230769 0.66666667 0.71794872 0.64102564
|
|
0.63157895 0.68421053 0.71052632 0.64102564]
|
|
|
|
mean value: 0.6744264507422402
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.6875 0.5625 0.875 0.75 0.9375
|
|
0.6 0.7 0.55714286 0.625 ]
|
|
|
|
mean value: 0.6919642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.83144796 0.82598039 0.83144796 0.81862745 0.84426848 0.80580694
|
|
0.80129672 0.82761251 0.85526316 0.80602007]
|
|
|
|
mean value: 0.8247771639900459
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.4 0.28571429 0.75 0.5 0.8
|
|
0.2 0.4 0.28571429 0.25 ]
|
|
|
|
mean value: 0.41214285714285714
|
|
|
|
key: train_jcc
|
|
value: [0.65853659 0.65 0.65853659 0.63414634 0.68292683 0.6097561
|
|
0.6 0.65 0.71052632 0.6097561 ]
|
|
|
|
mean value: 0.6464184852374839
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.44463158 0.42616534 0.55910635 0.43400979 0.4541378 0.42967033
|
|
0.43932247 0.51193166 0.42841673 0.43269181]
|
|
|
|
mean value: 0.4560083866119385
|
|
|
|
key: score_time
|
|
value: [0.01107335 0.01112819 0.01112199 0.0153079 0.01128125 0.0111084
|
|
0.02174282 0.01112676 0.01113582 0.01421332]
|
|
|
|
mean value: 0.012923979759216308
|
|
|
|
key: test_mcc
|
|
value: [0.81649658 0.83666003 0. 0.70710678 0.15811388 0.47809144
|
|
0.07559289 0.47809144 0.31428571 0.38575837]
|
|
|
|
mean value: 0.42501971429170726
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.91666667 0.5 0.83333333 0.66666667 0.75
|
|
0.58333333 0.75 0.66666667 0.72727273]
|
|
|
|
mean value: 0.7310606060606061
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.88888889 0.4 0.8 0.33333333 0.66666667
|
|
0.28571429 0.66666667 0.6 0.57142857]
|
|
|
|
mean value: 0.606984126984127
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.33333333 0.66666667 0.5 0.6
|
|
0.5 0.75 0.6 0.66666667]
|
|
|
|
mean value: 0.6416666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 0.5 1. 0.25 0.75 0.2 0.6 0.6 0.5 ]
|
|
|
|
mean value: 0.615
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.9375 0.5 0.875 0.5625 0.75
|
|
0.52857143 0.72857143 0.65714286 0.67857143]
|
|
|
|
mean value: 0.7092857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.8 0.25 0.66666667 0.2 0.5
|
|
0.16666667 0.5 0.42857143 0.4 ]
|
|
|
|
mean value: 0.4661904761904762
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01969337 0.00755858 0.00811434 0.00742817 0.0073216 0.00782657
|
|
0.00776482 0.00792885 0.00732517 0.00794721]
|
|
|
|
mean value: 0.008890867233276367
|
|
|
|
key: score_time
|
|
value: [0.01085663 0.00857472 0.00874376 0.0083375 0.00824308 0.00869703
|
|
0.00888395 0.00872183 0.00871086 0.00867748]
|
|
|
|
mean value: 0.008844685554504395
|
|
|
|
key: test_mcc
|
|
value: [0.83666003 0.625 0.81649658 0.81649658 0.83666003 1.
|
|
0.50709255 0.84515425 0.65714286 0.81009259]
|
|
|
|
mean value: 0.7750795466933069
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.83333333 0.91666667 0.91666667 0.91666667 1.
|
|
0.75 0.91666667 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8909090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.75 0.85714286 0.85714286 0.88888889 1.
|
|
0.72727273 0.90909091 0.8 0.85714286]
|
|
|
|
mean value: 0.8535569985569985
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.75 1. 1. 0.8 1.
|
|
0.66666667 0.83333333 0.8 1. ]
|
|
|
|
mean value: 0.865
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 0.75 1. 1. 0.8 1. 0.8 0.75]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.875 0.875 0.9375 1.
|
|
0.75714286 0.92857143 0.82857143 0.875 ]
|
|
|
|
mean value: 0.8826785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.6 0.75 0.75 0.8 1.
|
|
0.57142857 0.83333333 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7521428571428571
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0870738 0.08674216 0.08633971 0.0796845 0.08702898 0.08723402
|
|
0.0800159 0.08267999 0.08336306 0.0805757 ]
|
|
|
|
mean value: 0.08407378196716309
|
|
|
|
key: score_time
|
|
value: [0.01838231 0.01821375 0.01814437 0.01772857 0.01825523 0.01835394
|
|
0.01846385 0.01686049 0.01691628 0.0185349 ]
|
|
|
|
mean value: 0.01798536777496338
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.40824829 0.625 1. 0.40824829 0.83666003
|
|
0.35675303 0.68313005 0.50709255 0. ]
|
|
|
|
mean value: 0.5457587777402898
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.83333333 0.75 0.83333333 1. 0.75 0.91666667
|
|
0.66666667 0.83333333 0.75 0.63636364]
|
|
|
|
mean value: 0.796969696969697
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.57142857 0.75 1. 0.57142857 0.88888889
|
|
0.33333333 0.75 0.72727273 0. ]
|
|
|
|
mean value: 0.625901875901876
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.75 1. 0.66666667 0.8
|
|
1. 1. 0.66666667 0. ]
|
|
|
|
mean value: 0.755
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.75 1. 0.5 1. 0.2 0.6 0.8 0. ]
|
|
|
|
mean value: 0.585
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.6875 0.8125 1. 0.6875 0.9375
|
|
0.6 0.8 0.75714286 0.5 ]
|
|
|
|
mean value: 0.7532142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.4 0.6 1. 0.4 0.8
|
|
0.2 0.6 0.57142857 0. ]
|
|
|
|
mean value: 0.5071428571428571
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00703287 0.00692582 0.00732732 0.00697088 0.00684643 0.00690222
|
|
0.00689054 0.00692654 0.00703335 0.00678849]
|
|
|
|
mean value: 0.006964445114135742
|
|
|
|
key: score_time
|
|
value: [0.00811267 0.00805044 0.00872636 0.00804448 0.00804257 0.00807214
|
|
0.0084269 0.00811172 0.00793386 0.00806904]
|
|
|
|
mean value: 0.008159017562866211
|
|
|
|
key: test_mcc
|
|
value: [ 0.63245553 0.63245553 0.25 0. 0.625 0.15811388
|
|
0.47809144 0.29277002 -0.23904572 0.38575837]
|
|
|
|
mean value: 0.3215599065732439
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.83333333 0.83333333 0.66666667 0.58333333 0.83333333 0.66666667
|
|
0.75 0.66666667 0.41666667 0.72727273]
|
|
|
|
mean value: 0.6977272727272728
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.5 0.28571429 0.75 0.33333333
|
|
0.66666667 0.5 0.22222222 0.57142857]
|
|
|
|
mean value: 0.5162698412698412
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.5 0.33333333 0.75 0.5
|
|
0.75 0.66666667 0.25 0.66666667]
|
|
|
|
mean value: 0.6416666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.5 0.25 0.75 0.25 0.6 0.4 0.2 0.5 ]
|
|
|
|
mean value: 0.445
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.75 0.625 0.5 0.8125 0.5625
|
|
0.72857143 0.62857143 0.38571429 0.67857143]
|
|
|
|
mean value: 0.6421428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.33333333 0.16666667 0.6 0.2
|
|
0.5 0.33333333 0.125 0.4 ]
|
|
|
|
mean value: 0.36583333333333334
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.9999063 0.96824765 0.96530747 1.01568055 1.00144243 1.0073843
|
|
1.0299902 0.9734323 0.96623063 0.96925235]
|
|
|
|
mean value: 0.9896874189376831
|
|
|
|
key: score_time
|
|
value: [0.08913732 0.08848977 0.09147906 0.0936265 0.09639764 0.09607625
|
|
0.08916879 0.08925462 0.08908725 0.08973861]
|
|
|
|
mean value: 0.09124557971954346
|
|
|
|
key: test_mcc
|
|
value: [1. 0.625 0.625 1. 0.40824829 1.
|
|
0.83666003 0.65714286 0.65714286 0.81009259]
|
|
|
|
mean value: 0.7619286618584635
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.83333333 0.83333333 1. 0.75 1.
|
|
0.91666667 0.83333333 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8909090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.75 1. 0.57142857 1.
|
|
0.88888889 0.8 0.8 0.85714286]
|
|
|
|
mean value: 0.8417460317460318
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.75 1. 0.66666667 1.
|
|
1. 0.8 0.8 1. ]
|
|
|
|
mean value: 0.8766666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 1. 0.5 1. 0.8 0.8 0.8 0.75]
|
|
|
|
mean value: 0.8150000000000001
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.8125 1. 0.6875 1.
|
|
0.9 0.82857143 0.82857143 0.875 ]
|
|
|
|
mean value: 0.8744642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.6 1. 0.4 1.
|
|
0.8 0.66666667 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7483333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
|
|
key: fit_time
|
|
value: [1.706635 0.86073232 0.898561 0.83530664 0.94017434 0.93120837
|
|
0.82625723 0.85147214 0.83630848 0.80121708]
|
|
|
|
mean value: 0.948787260055542
|
|
|
|
key: score_time
|
|
value: [0.23586416 0.21526957 0.2305944 0.21968794 0.23709798 0.14258313
|
|
0.17830396 0.22208166 0.23773837 0.23715353]
|
|
|
|
mean value: 0.215637469291687
|
|
|
|
key: test_mcc
|
|
value: [0.81649658 0.625 0.625 0.81649658 0.63245553 1.
|
|
0.52915026 0.47809144 0.68313005 0.81009259]
|
|
|
|
mean value: 0.701591303820076
|
|
|
|
key: train_mcc
|
|
value: [0.94025192 0.9600061 0.94025192 0.9600061 0.9600061 0.9600061
|
|
0.95952175 0.95952175 0.97968078 0.94053994]
|
|
|
|
mean value: 0.9559792483395588
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.83333333 0.83333333 0.91666667 0.83333333 1.
|
|
0.75 0.75 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8575757575757575
|
|
|
|
key: train_accuracy
|
|
value: [0.97196262 0.98130841 0.97196262 0.98130841 0.98130841 0.98130841
|
|
0.98130841 0.98130841 0.99065421 0.97222222]
|
|
|
|
mean value: 0.9794652128764278
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.75 0.75 0.85714286 0.66666667 1.
|
|
0.57142857 0.66666667 0.75 0.85714286]
|
|
|
|
mean value: 0.7726190476190475
|
|
|
|
key: train_fscore
|
|
value: [0.96 0.97368421 0.96 0.97368421 0.97368421 0.97368421
|
|
0.97297297 0.97297297 0.98666667 0.96 ]
|
|
|
|
mean value: 0.9707349454717876
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.75 1. 1. 1. 1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.925
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.75 0.75 0.5 1. 0.4 0.6 0.6 0.75]
|
|
|
|
mean value: 0.685
|
|
|
|
key: train_recall
|
|
value: [0.92307692 0.94871795 0.92307692 0.94871795 0.94871795 0.94871795
|
|
0.94736842 0.94736842 0.97368421 0.92307692]
|
|
|
|
mean value: 0.9432523616734143
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.8125 0.8125 0.875 0.75 1.
|
|
0.7 0.72857143 0.8 0.875 ]
|
|
|
|
mean value: 0.8228571428571428
|
|
|
|
key: train_roc_auc
|
|
value: [0.96153846 0.97435897 0.96153846 0.97435897 0.97435897 0.97435897
|
|
0.97368421 0.97368421 0.98684211 0.96153846]
|
|
|
|
mean value: 0.9716261808367072
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.6 0.6 0.75 0.5 1. 0.4 0.5 0.6 0.75]
|
|
|
|
mean value: 0.645
|
|
|
|
key: train_jcc
|
|
value: [0.92307692 0.94871795 0.92307692 0.94871795 0.94871795 0.94871795
|
|
0.94736842 0.94736842 0.97368421 0.92307692]
|
|
|
|
mean value: 0.9432523616734143
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01675677 0.00714898 0.0082562 0.0074234 0.00729084 0.00770926
|
|
0.00744081 0.00784683 0.00782228 0.00789189]
|
|
|
|
mean value: 0.00855872631072998
|
|
|
|
key: score_time
|
|
value: [0.01316333 0.00809813 0.0098176 0.00799656 0.00866699 0.0089457
|
|
0.00823903 0.00895739 0.00873828 0.0089376 ]
|
|
|
|
mean value: 0.009156060218811036
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0.25 -0.23904572 0.47809144 0.40824829 0.
|
|
-0.09759001 0.52915026 0.31428571 0.38575837]
|
|
|
|
mean value: 0.20288983564397506
|
|
|
|
key: train_mcc
|
|
value: [0.4754902 0.50673892 0.4653488 0.44239297 0.48817818 0.50337256
|
|
0.39242808 0.39534618 0.48161946 0.37522992]
|
|
|
|
mean value: 0.4526145268371993
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.66666667 0.41666667 0.75 0.75 0.41666667
|
|
0.5 0.75 0.66666667 0.72727273]
|
|
|
|
mean value: 0.6310606060606061
|
|
|
|
key: train_accuracy
|
|
value: [0.75700935 0.77570093 0.75700935 0.74766355 0.76635514 0.77570093
|
|
0.72897196 0.71962617 0.76635514 0.72222222]
|
|
|
|
mean value: 0.7516614745586708
|
|
|
|
key: test_fscore
|
|
value: [0. 0.5 0.22222222 0.66666667 0.57142857 0.46153846
|
|
0.25 0.57142857 0.6 0.57142857]
|
|
|
|
mean value: 0.4414713064713065
|
|
|
|
key: train_fscore
|
|
value: [0.66666667 0.67567568 0.64864865 0.63013699 0.66666667 0.66666667
|
|
0.5915493 0.61538462 0.65753425 0.57142857]
|
|
|
|
mean value: 0.6390358039788872
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.2 0.6 0.66666667 0.33333333
|
|
0.33333333 1. 0.6 0.66666667]
|
|
|
|
mean value: 0.49
|
|
|
|
key: train_precision
|
|
value: [0.66666667 0.71428571 0.68571429 0.67647059 0.69444444 0.72727273
|
|
0.63636364 0.6 0.68571429 0.64516129]
|
|
|
|
mean value: 0.6732093639019635
|
|
|
|
key: test_recall
|
|
value: [0. 0.5 0.25 0.75 0.5 0.75 0.2 0.4 0.6 0.5 ]
|
|
|
|
mean value: 0.445
|
|
|
|
key: train_recall
|
|
value: [0.66666667 0.64102564 0.61538462 0.58974359 0.64102564 0.61538462
|
|
0.55263158 0.63157895 0.63157895 0.51282051]
|
|
|
|
mean value: 0.6097840755735493
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.625 0.375 0.75 0.6875 0.5
|
|
0.45714286 0.7 0.65714286 0.67857143]
|
|
|
|
mean value: 0.5930357142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.7377451 0.74698341 0.72680995 0.71398944 0.73963047 0.74151584
|
|
0.68935927 0.69984744 0.73607933 0.67670011]
|
|
|
|
mean value: 0.7208660360817448
|
|
|
|
key: test_jcc
|
|
value: [0. 0.33333333 0.125 0.5 0.4 0.3
|
|
0.14285714 0.4 0.42857143 0.4 ]
|
|
|
|
mean value: 0.30297619047619045
|
|
|
|
key: train_jcc
|
|
value: [0.5 0.51020408 0.48 0.46 0.5 0.5
|
|
0.42 0.44444444 0.48979592 0.4 ]
|
|
|
|
mean value: 0.47044444444444444
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.07072282 0.03617072 0.03579974 0.03728676 0.03555346 0.03524041
|
|
0.03362727 0.03646469 0.05544496 0.09033132]
|
|
|
|
mean value: 0.04666421413421631
|
|
|
|
key: score_time
|
|
value: [0.01112819 0.0112555 0.01102757 0.01083326 0.01092458 0.01118302
|
|
0.01061773 0.01085544 0.00988364 0.0103364 ]
|
|
|
|
mean value: 0.010804533958435059
|
|
|
|
key: test_mcc
|
|
value: [1. 0.625 0.81649658 0.81649658 0.83666003 1.
|
|
0.65714286 1. 0.65714286 0.81009259]
|
|
|
|
mean value: 0.8219031489976224
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.83333333 0.91666667 0.91666667 0.91666667 1.
|
|
0.83333333 1. 0.83333333 0.90909091]
|
|
|
|
mean value: 0.9159090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.85714286 0.85714286 0.88888889 1.
|
|
0.8 1. 0.8 0.85714286]
|
|
|
|
mean value: 0.881031746031746
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 1. 0.8 1. 0.8 1. 0.8 1. ]
|
|
|
|
mean value: 0.915
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 0.75 1. 1. 0.8 1. 0.8 0.75]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.875 0.875 0.9375 1.
|
|
0.82857143 1. 0.82857143 0.875 ]
|
|
|
|
mean value: 0.9032142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.75 0.75 0.8 1.
|
|
0.66666667 1. 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7983333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01759219 0.01122999 0.01159906 0.01160884 0.011446 0.0116322
|
|
0.01226497 0.01147413 0.01333094 0.01733923]
|
|
|
|
mean value: 0.01295175552368164
|
|
|
|
key: score_time
|
|
value: [0.01068926 0.01071072 0.01065826 0.01099992 0.01074338 0.01079488
|
|
0.01092148 0.01075888 0.01078367 0.01083136]
|
|
|
|
mean value: 0.010789179801940918
|
|
|
|
key: test_mcc
|
|
value: [ 0.625 0.25 0.35355339 0.83666003 0.83666003 0.83666003
|
|
0.65714286 1. 0.71428571 -0.17857143]
|
|
|
|
mean value: 0.5931390613052643
|
|
|
|
key: train_mcc
|
|
value: [0.90236159 0.96085507 0.96085507 0.90236159 0.93999796 0.92091277
|
|
0.92008523 0.92008523 0.92008523 0.96106604]
|
|
|
|
mean value: 0.9308665771065557
|
|
|
|
key: test_accuracy
|
|
value: [0.83333333 0.66666667 0.66666667 0.91666667 0.91666667 0.91666667
|
|
0.83333333 1. 0.83333333 0.45454545]
|
|
|
|
mean value: 0.8037878787878788
|
|
|
|
key: train_accuracy
|
|
value: [0.95327103 0.98130841 0.98130841 0.95327103 0.97196262 0.96261682
|
|
0.96261682 0.96261682 0.96261682 0.98148148]
|
|
|
|
mean value: 0.9673070266528211
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.5 0.6 0.88888889 0.88888889 0.88888889
|
|
0.8 1. 0.83333333 0.25 ]
|
|
|
|
mean value: 0.74
|
|
|
|
key: train_fscore
|
|
value: [0.9382716 0.975 0.975 0.9382716 0.96202532 0.95
|
|
0.94871795 0.94871795 0.94871795 0.975 ]
|
|
|
|
mean value: 0.9559722372486086
|
|
|
|
key: test_precision
|
|
value: [0.75 0.5 0.5 0.8 0.8 0.8
|
|
0.8 1. 0.71428571 0.25 ]
|
|
|
|
mean value: 0.6914285714285715
|
|
|
|
key: train_precision
|
|
value: [0.9047619 0.95121951 0.95121951 0.9047619 0.95 0.92682927
|
|
0.925 0.925 0.925 0.95121951]
|
|
|
|
mean value: 0.9315011614401858
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 0.75 1. 1. 1. 0.8 1. 1. 0.25]
|
|
|
|
mean value: 0.805
|
|
|
|
key: train_recall
|
|
value: [0.97435897 1. 1. 0.97435897 0.97435897 0.97435897
|
|
0.97368421 0.97368421 0.97368421 1. ]
|
|
|
|
mean value: 0.9818488529014845
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.625 0.6875 0.9375 0.9375 0.9375
|
|
0.82857143 1. 0.85714286 0.41071429]
|
|
|
|
mean value: 0.8033928571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.95776772 0.98529412 0.98529412 0.95776772 0.9724736 0.96512066
|
|
0.96510297 0.96510297 0.96510297 0.98550725]
|
|
|
|
mean value: 0.9704534119579886
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.33333333 0.42857143 0.8 0.8 0.8
|
|
0.66666667 1. 0.71428571 0.14285714]
|
|
|
|
mean value: 0.6285714285714286
|
|
|
|
key: train_jcc
|
|
value: [0.88372093 0.95121951 0.95121951 0.88372093 0.92682927 0.9047619
|
|
0.90243902 0.90243902 0.90243902 0.95121951]
|
|
|
|
mean value: 0.9160008643275801
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02494788 0.01585054 0.00776625 0.00726771 0.00712013 0.00683975
|
|
0.00691533 0.00710177 0.00688958 0.00703335]
|
|
|
|
mean value: 0.00977323055267334
|
|
|
|
key: score_time
|
|
value: [0.01840568 0.0093627 0.00875354 0.00817943 0.00809598 0.00807095
|
|
0.00832725 0.00807333 0.00806046 0.00866556]
|
|
|
|
mean value: 0.009399485588073731
|
|
|
|
key: test_mcc
|
|
value: [0.42640143 0.40824829 0.11952286 0.47809144 0.15811388 0.35355339
|
|
0.35675303 0.29277002 0.47809144 0.41833001]
|
|
|
|
mean value: 0.3489875814335667
|
|
|
|
key: train_mcc
|
|
value: [0.45416735 0.52159509 0.45416735 0.49964579 0.52383566 0.43117964
|
|
0.47315489 0.49023798 0.44470372 0.45631672]
|
|
|
|
mean value: 0.4749004183145091
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.75 0.58333333 0.75 0.66666667 0.66666667
|
|
0.66666667 0.66666667 0.75 0.72727273]
|
|
|
|
mean value: 0.6977272727272728
|
|
|
|
key: train_accuracy
|
|
value: [0.75700935 0.78504673 0.75700935 0.77570093 0.78504673 0.74766355
|
|
0.76635514 0.77570093 0.75700935 0.75925926]
|
|
|
|
mean value: 0.7665801315334025
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.57142857 0.44444444 0.66666667 0.33333333 0.6
|
|
0.33333333 0.5 0.66666667 0.4 ]
|
|
|
|
mean value: 0.4915873015873016
|
|
|
|
key: train_fscore
|
|
value: [0.60606061 0.66666667 0.60606061 0.625 0.63492063 0.58461538
|
|
0.63768116 0.625 0.59375 0.60606061]
|
|
|
|
mean value: 0.6185815663804795
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.4 0.6 0.5 0.5
|
|
1. 0.66666667 0.75 1. ]
|
|
|
|
mean value: 0.7083333333333334
|
|
|
|
key: train_precision
|
|
value: [0.74074074 0.76666667 0.74074074 0.8 0.83333333 0.73076923
|
|
0.70967742 0.76923077 0.73076923 0.74074074]
|
|
|
|
mean value: 0.7562668872346292
|
|
|
|
key: test_recall
|
|
value: [0.25 0.5 0.5 0.75 0.25 0.75 0.2 0.4 0.6 0.25]
|
|
|
|
mean value: 0.445
|
|
|
|
key: train_recall
|
|
value: [0.51282051 0.58974359 0.51282051 0.51282051 0.51282051 0.48717949
|
|
0.57894737 0.52631579 0.5 0.51282051]
|
|
|
|
mean value: 0.5246288798920378
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.6875 0.5625 0.75 0.5625 0.6875
|
|
0.6 0.62857143 0.72857143 0.625 ]
|
|
|
|
mean value: 0.6457142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.70493967 0.74340121 0.70493967 0.71964555 0.72699849 0.69211916
|
|
0.72425629 0.71967963 0.69927536 0.70568562]
|
|
|
|
mean value: 0.7140940648394546
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.4 0.28571429 0.5 0.2 0.42857143
|
|
0.2 0.33333333 0.5 0.25 ]
|
|
|
|
mean value: 0.33476190476190476
|
|
|
|
key: train_jcc
|
|
value: [0.43478261 0.5 0.43478261 0.45454545 0.46511628 0.41304348
|
|
0.46808511 0.45454545 0.42222222 0.43478261]
|
|
|
|
mean value: 0.44819058211137036
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00787282 0.00709176 0.00747252 0.00727296 0.00753498 0.0074842
|
|
0.00761414 0.00750947 0.00763249 0.00774121]
|
|
|
|
mean value: 0.00752265453338623
|
|
|
|
key: score_time
|
|
value: [0.00790644 0.00816011 0.00783634 0.00811148 0.00789452 0.00806546
|
|
0.00857043 0.00817347 0.00801802 0.00889587]
|
|
|
|
mean value: 0.008163213729858398
|
|
|
|
key: test_mcc
|
|
value: [1. 0.625 0.11952286 0.70710678 0.47809144 0.70710678
|
|
0.37142857 0.84515425 0.29277002 0.60714286]
|
|
|
|
mean value: 0.5753323572224797
|
|
|
|
key: train_mcc
|
|
value: [0.8165399 0.85945065 0.82420912 0.82726738 0.76153359 0.83287099
|
|
0.79235477 0.84830731 0.84110073 0.8789655 ]
|
|
|
|
mean value: 0.8282599941345357
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.83333333 0.58333333 0.83333333 0.75 0.83333333
|
|
0.66666667 0.91666667 0.66666667 0.81818182]
|
|
|
|
mean value: 0.7901515151515152
|
|
|
|
key: train_accuracy
|
|
value: [0.90654206 0.93457944 0.91588785 0.91588785 0.87850467 0.91588785
|
|
0.88785047 0.92523364 0.92523364 0.94444444]
|
|
|
|
mean value: 0.9150051921079958
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.44444444 0.8 0.66666667 0.8
|
|
0.66666667 0.90909091 0.5 0.75 ]
|
|
|
|
mean value: 0.7286868686868687
|
|
|
|
key: train_fscore
|
|
value: [0.88372093 0.90410959 0.86956522 0.89156627 0.85057471 0.89411765
|
|
0.86363636 0.90243902 0.88235294 0.92105263]
|
|
|
|
mean value: 0.8863135322209726
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.4 0.66666667 0.6 0.66666667
|
|
0.57142857 0.83333333 0.66666667 0.75 ]
|
|
|
|
mean value: 0.6904761904761905
|
|
|
|
key: train_precision
|
|
value: [0.80851064 0.97058824 1. 0.84090909 0.77083333 0.82608696
|
|
0.76 0.84090909 1. 0.94594595]
|
|
|
|
mean value: 0.876378329121119
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.5 1. 0.75 1. 0.8 1. 0.4 0.75]
|
|
|
|
mean value: 0.795
|
|
|
|
key: train_recall
|
|
value: [0.97435897 0.84615385 0.76923077 0.94871795 0.94871795 0.97435897
|
|
1. 0.97368421 0.78947368 0.8974359 ]
|
|
|
|
mean value: 0.9122132253711202
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.5625 0.875 0.75 0.875
|
|
0.68571429 0.92857143 0.62857143 0.80357143]
|
|
|
|
mean value: 0.7921428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.92100302 0.91572398 0.88461538 0.92288839 0.89347662 0.92835596
|
|
0.91304348 0.93611747 0.89473684 0.9342252 ]
|
|
|
|
mean value: 0.9144186331459181
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.28571429 0.66666667 0.5 0.66666667
|
|
0.5 0.83333333 0.33333333 0.6 ]
|
|
|
|
mean value: 0.5985714285714285
|
|
|
|
key: train_jcc
|
|
value: [0.79166667 0.825 0.76923077 0.80434783 0.74 0.80851064
|
|
0.76 0.82222222 0.78947368 0.85365854]
|
|
|
|
mean value: 0.7964110343300379
|
|
|
|
MCC on Blind test: 0.04
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00992465 0.00924659 0.00704741 0.00705194 0.00760007 0.00777459
|
|
0.00699854 0.00773525 0.00748038 0.00777173]
|
|
|
|
mean value: 0.007863116264343262
|
|
|
|
key: score_time
|
|
value: [0.01014447 0.00912404 0.00799847 0.00816655 0.00813746 0.00799036
|
|
0.00796533 0.00817347 0.00830007 0.00831962]
|
|
|
|
mean value: 0.00843198299407959
|
|
|
|
key: test_mcc
|
|
value: [1. 0.40824829 0.40824829 0.625 0.81649658 0.625
|
|
0.83666003 0.65714286 0.23904572 0.41833001]
|
|
|
|
mean value: 0.6034171780666301
|
|
|
|
key: train_mcc
|
|
value: [0.8720951 0.89986237 0.74811148 0.77945561 0.86259524 0.6717753
|
|
0.93862091 0.88019137 0.69504805 0.78691217]
|
|
|
|
mean value: 0.8134667600062208
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.75 0.75 0.83333333 0.91666667 0.83333333
|
|
0.91666667 0.83333333 0.58333333 0.72727273]
|
|
|
|
mean value: 0.8143939393939394
|
|
|
|
key: train_accuracy
|
|
value: [0.93457944 0.95327103 0.87850467 0.89719626 0.93457944 0.8411215
|
|
0.97196262 0.94392523 0.82242991 0.89814815]
|
|
|
|
mean value: 0.9075718241606092
|
|
|
|
key: test_fscore
|
|
value: [1. 0.57142857 0.57142857 0.75 0.85714286 0.75
|
|
0.88888889 0.8 0.61538462 0.4 ]
|
|
|
|
mean value: 0.7204273504273505
|
|
|
|
key: train_fscore
|
|
value: [0.91764706 0.93670886 0.8 0.86075949 0.91358025 0.72131148
|
|
0.96 0.91428571 0.8 0.8358209 ]
|
|
|
|
mean value: 0.8660113745385427
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.66666667 0.75 1. 0.75
|
|
1. 0.8 0.5 1. ]
|
|
|
|
mean value: 0.8133333333333334
|
|
|
|
key: train_precision
|
|
value: [0.84782609 0.925 1. 0.85 0.88095238 1.
|
|
0.97297297 1. 0.66666667 1. ]
|
|
|
|
mean value: 0.9143418107548542
|
|
|
|
key: test_recall
|
|
value: [1. 0.5 0.5 0.75 0.75 0.75 0.8 0.8 0.8 0.25]
|
|
|
|
mean value: 0.6900000000000001
|
|
|
|
key: train_recall
|
|
value: [1. 0.94871795 0.66666667 0.87179487 0.94871795 0.56410256
|
|
0.94736842 0.84210526 1. 0.71794872]
|
|
|
|
mean value: 0.8507422402159244
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.6875 0.6875 0.8125 0.875 0.8125
|
|
0.9 0.82857143 0.61428571 0.625 ]
|
|
|
|
mean value: 0.7842857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.94852941 0.95230015 0.83333333 0.89177979 0.93759427 0.78205128
|
|
0.96643783 0.92105263 0.86231884 0.85897436]
|
|
|
|
mean value: 0.8954371900141855
|
|
|
|
key: test_jcc
|
|
value: [1. 0.4 0.4 0.6 0.75 0.6
|
|
0.8 0.66666667 0.44444444 0.25 ]
|
|
|
|
mean value: 0.5911111111111111
|
|
|
|
key: train_jcc
|
|
value: [0.84782609 0.88095238 0.66666667 0.75555556 0.84090909 0.56410256
|
|
0.92307692 0.84210526 0.66666667 0.71794872]
|
|
|
|
mean value: 0.7705809915992983
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07426667 0.06154084 0.06283951 0.06345463 0.06271338 0.06455684
|
|
0.06105089 0.06551576 0.06561875 0.06309128]
|
|
|
|
mean value: 0.0644648551940918
|
|
|
|
key: score_time
|
|
value: [0.01463723 0.01418447 0.01481771 0.01499844 0.01489115 0.01537299
|
|
0.01565957 0.01581383 0.01552248 0.01574159]
|
|
|
|
mean value: 0.015163946151733398
|
|
|
|
key: test_mcc
|
|
value: [1. 0.625 0.625 0.625 0.83666003 1.
|
|
0.52915026 1. 0.65714286 0.81009259]
|
|
|
|
mean value: 0.7708045733190834
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.83333333 0.83333333 0.83333333 0.91666667 1.
|
|
0.75 1. 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8909090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.75 0.75 0.88888889 1.
|
|
0.57142857 1. 0.8 0.85714286]
|
|
|
|
mean value: 0.8367460317460318
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.75 0.75 0.8 1. 1. 1. 0.8 1. ]
|
|
|
|
mean value: 0.885
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 0.75 1. 1. 0.4 1. 0.8 0.75]
|
|
|
|
mean value: 0.8200000000000001
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.8125 0.8125 0.9375 1.
|
|
0.7 1. 0.82857143 0.875 ]
|
|
|
|
mean value: 0.8778571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.6 0.6 0.8 1.
|
|
0.4 1. 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7416666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
  _warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02740383 0.02774906 0.03296709 0.04285073 0.033988 0.02595377
|
|
0.04620218 0.03556585 0.02733755 0.03265309]
|
|
|
|
mean value: 0.03326711654663086
|
|
|
|
key: score_time
|
|
value: [0.02362061 0.02275753 0.03781056 0.03313112 0.02986407 0.02139044
|
|
0.03054595 0.02594328 0.02176881 0.02311182]
|
|
|
|
mean value: 0.02699441909790039
|
|
|
|
key: test_mcc
|
|
value: [0.83666003 0.625 0.81649658 1. 0.83666003 1.
|
|
0.83666003 1. 0.65714286 0.81009259]
|
|
|
|
mean value: 0.8418712104973792
|
|
|
|
key: train_mcc
|
|
value: [1. 0.97991726 1. 0.97991726 0.97991726 1.
|
|
1. 1. 1. 0.98002018]
|
|
|
|
mean value: 0.9919771953521386
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.83333333 0.91666667 1. 0.91666667 1.
|
|
0.91666667 1. 0.83333333 0.90909091]
|
|
|
|
mean value: 0.9242424242424242
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99065421 1. 0.99065421 0.99065421 1.
|
|
1. 1. 1. 0.99074074]
|
|
|
|
mean value: 0.9962703357563171
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.75 0.85714286 1. 0.88888889 1.
|
|
0.88888889 1. 0.8 0.85714286]
|
|
|
|
mean value: 0.8930952380952382
|
|
|
|
key: train_fscore
|
|
value: [1. 0.98701299 1. 0.98701299 0.98701299 1.
|
|
1. 1. 1. 0.98701299]
|
|
|
|
mean value: 0.9948051948051948
|
|
|
|
key: test_precision
|
|
value: [0.8 0.75 1. 1. 0.8 1. 1. 1. 0.8 1. ]
|
|
|
|
mean value: 0.915
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 1. 1. 1. 0.8 1. 0.8 0.75]
|
|
|
|
mean value: 0.885
|
|
|
|
key: train_recall
|
|
value: [1. 0.97435897 1. 0.97435897 0.97435897 1.
|
|
1. 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9897435897435898
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.875 1. 0.9375 1.
|
|
0.9 1. 0.82857143 0.875 ]
|
|
|
|
mean value: 0.9166071428571428
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.98717949 1. 0.98717949 0.98717949 1.
|
|
1. 1. 1. 0.98717949]
|
|
|
|
mean value: 0.9948717948717949
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.6 0.75 1. 0.8 1.
|
|
0.8 1. 0.66666667 0.75 ]
|
|
|
|
mean value: 0.8166666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 0.97435897 1. 0.97435897 0.97435897 1.
|
|
1. 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9897435897435898
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02972174 0.03583384 0.03609109 0.03561258 0.0359695 0.03587818
|
|
0.03609157 0.03597355 0.03254533 0.03658676]
|
|
|
|
mean value: 0.0350304126739502
|
|
|
|
key: score_time
|
|
value: [0.02094769 0.02031541 0.02000165 0.02015448 0.01970243 0.01941609
|
|
0.01103115 0.02209592 0.0211103 0.02550483]
|
|
|
|
mean value: 0.020027995109558105
|
|
|
|
key: test_mcc
|
|
value: [0.42640143 0.15811388 0.40824829 0.63245553 0.15811388 0.40824829
|
|
0.35675303 0.07559289 0.11952286 0. ]
|
|
|
|
mean value: 0.27434501012310836
|
|
|
|
key: train_mcc
|
|
value: [0.94025192 0.94025192 0.97991726 0.92064018 0.92064018 0.92064018
|
|
0.93950808 0.93950808 0.93950808 0.94053994]
|
|
|
|
mean value: 0.9381405840047681
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.66666667 0.75 0.83333333 0.66666667 0.75
|
|
0.66666667 0.58333333 0.58333333 0.63636364]
|
|
|
|
mean value: 0.6886363636363636
|
|
|
|
key: train_accuracy
|
|
value: [0.97196262 0.97196262 0.99065421 0.96261682 0.96261682 0.96261682
|
|
0.97196262 0.97196262 0.97196262 0.97222222]
|
|
|
|
mean value: 0.9710539979231568
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.33333333 0.57142857 0.66666667 0.33333333 0.57142857
|
|
0.33333333 0.28571429 0.44444444 0. ]
|
|
|
|
mean value: 0.39396825396825397
|
|
|
|
key: train_fscore
|
|
value: [0.96 0.96 0.98701299 0.94594595 0.94594595 0.94594595
|
|
0.95890411 0.95890411 0.95890411 0.96 ]
|
|
|
|
mean value: 0.9581563153617948
|
|
|
|
key: test_precision
|
|
value: [1. 0.5 0.66666667 1. 0.5 0.66666667
|
|
1. 0.5 0.5 0. ]
|
|
|
|
mean value: 0.6333333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.25 0.25 0.5 0.5 0.25 0.5 0.2 0.2 0.4 0. ]
|
|
|
|
mean value: 0.305
|
|
|
|
key: train_recall
|
|
value: [0.92307692 0.92307692 0.97435897 0.8974359 0.8974359 0.8974359
|
|
0.92105263 0.92105263 0.92105263 0.92307692]
|
|
|
|
mean value: 0.9199055330634278
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.5625 0.6875 0.75 0.5625 0.6875
|
|
0.6 0.52857143 0.55714286 0.5 ]
|
|
|
|
mean value: 0.6060714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.96153846 0.96153846 0.98717949 0.94871795 0.94871795 0.94871795
|
|
0.96052632 0.96052632 0.96052632 0.96153846]
|
|
|
|
mean value: 0.9599527665317139
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.2 0.4 0.5 0.2 0.4
|
|
0.2 0.16666667 0.28571429 0. ]
|
|
|
|
mean value: 0.26023809523809527
|
|
|
|
key: train_jcc
|
|
value: [0.92307692 0.92307692 0.97435897 0.8974359 0.8974359 0.8974359
|
|
0.92105263 0.92105263 0.92105263 0.92307692]
|
|
|
|
mean value: 0.9199055330634278
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08681178 0.08099294 0.0833962 0.08435941 0.08145213 0.08759737
|
|
0.08979297 0.08592725 0.08576846 0.07664156]
|
|
|
|
mean value: 0.08427400588989258
|
|
|
|
key: score_time
|
|
value: [0.00886655 0.00919652 0.00933671 0.00866079 0.00926757 0.00908661
|
|
0.00926757 0.0088954 0.00944066 0.0094223 ]
|
|
|
|
mean value: 0.009144067764282227
|
|
|
|
key: test_mcc
|
|
value: [0.83666003 0.625 0.81649658 0.81649658 0.83666003 0.83666003
|
|
0.65714286 0.84515425 0.65714286 0.81009259]
|
|
|
|
mean value: 0.7737505797772892
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.83333333 0.91666667 0.91666667 0.91666667 0.91666667
|
|
0.83333333 0.91666667 0.83333333 0.90909091]
|
|
|
|
mean value: 0.8909090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.75 0.85714286 0.85714286 0.88888889 0.88888889
|
|
0.8 0.90909091 0.8 0.85714286]
|
|
|
|
mean value: 0.8497186147186148
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.75 1. 1. 0.8 0.8
|
|
0.8 0.83333333 0.8 1. ]
|
|
|
|
mean value: 0.8583333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.75 0.75 1. 1. 0.8 1. 0.8 0.75]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.875 0.875 0.9375 0.9375
|
|
0.82857143 0.92857143 0.82857143 0.875 ]
|
|
|
|
mean value: 0.8835714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.6 0.75 0.75 0.8 0.8
|
|
0.66666667 0.83333333 0.66666667 0.75 ]
|
|
|
|
mean value: 0.7416666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00962162 0.01063228 0.01091313 0.01079535 0.01196051 0.0170722
|
|
0.01151824 0.01096225 0.02570939 0.01203942]
|
|
|
|
mean value: 0.01312243938446045
|
|
|
|
key: score_time
|
|
value: [0.01107264 0.01098132 0.01062155 0.01118159 0.01129007 0.01117349
|
|
0.01122856 0.01094556 0.01141524 0.01125026]
|
|
|
|
mean value: 0.01111602783203125
|
|
|
|
key: test_mcc
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0. 0.07559289 0. 0. ]
|
|
|
|
mean value: 0.007559289460184544
|
|
|
|
key: train_mcc
|
|
value: [0.32183783 0.32183783 0.32183783 0.18223949 0.26021572 0.26021572
|
|
0.32843368 0.32843368 0.29834424 0.29306141]
|
|
|
|
mean value: 0.2916457442021758
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.66666667 0.66666667 0.66666667 0.66666667 0.66666667
|
|
0.58333333 0.58333333 0.58333333 0.63636364]
|
|
|
|
mean value: 0.6386363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.69158879 0.69158879 0.69158879 0.65420561 0.6728972 0.6728972
|
|
0.70093458 0.70093458 0.69158879 0.68518519]
|
|
|
|
mean value: 0.6853409484250605
|
|
|
|
key: test_fscore
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0. 0.28571429 0. 0. ]
|
|
|
|
mean value: 0.028571428571428574
|
|
|
|
key: train_fscore
|
|
value: [0.26666667 0.26666667 0.26666667 0.09756098 0.18604651 0.18604651
|
|
0.27272727 0.27272727 0.23255814 0.22727273]
|
|
|
|
mean value: 0.22749394111277266
|
|
|
|
key: test_precision
|
|
value: [0. 0. 0. 0. 0. 0. 0. 0.5 0. 0. ]
|
|
|
|
mean value: 0.05
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0. 0. 0. 0. 0. 0. 0. 0.2 0. 0. ]
|
|
|
|
mean value: 0.02
|
|
|
|
key: train_recall
|
|
value: [0.15384615 0.15384615 0.15384615 0.05128205 0.1025641 0.1025641
|
|
0.15789474 0.15789474 0.13157895 0.12820513]
|
|
|
|
mean value: 0.12935222672064778
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.5 0.5 0.5 0.5
|
|
0.5 0.52857143 0.5 0.5 ]
|
|
|
|
mean value: 0.5028571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.57692308 0.57692308 0.57692308 0.52564103 0.55128205 0.55128205
|
|
0.57894737 0.57894737 0.56578947 0.56410256]
|
|
|
|
mean value: 0.5646761133603239
|
|
|
|
key: test_jcc
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0. 0.16666667 0. 0. ]
|
|
|
|
mean value: 0.016666666666666666
|
|
|
|
key: train_jcc
|
|
value: [0.15384615 0.15384615 0.15384615 0.05128205 0.1025641 0.1025641
|
|
0.15789474 0.15789474 0.13157895 0.12820513]
|
|
|
|
mean value: 0.12935222672064778
|
|
|
|
MCC on Blind test: -0.02
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0105691 0.01015902 0.00814056 0.00781894 0.00773811 0.00766277
|
|
0.00809789 0.0085144 0.00822377 0.008322 ]
|
|
|
|
mean value: 0.008524656295776367
|
|
|
|
key: score_time
|
|
value: [0.01082993 0.00936484 0.00863528 0.00822663 0.00833321 0.00819302
|
|
0.0086484 0.00855327 0.00863814 0.00825906]
|
|
|
|
mean value: 0.008768177032470703
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.40824829 0.35355339 1. 0.625 0.70710678
|
|
0.68313005 0.83666003 0.31428571 0.69006556]
|
|
|
|
mean value: 0.6250505345503478
|
|
|
|
key: train_mcc
|
|
value: [0.79826546 0.8375252 0.89876312 0.85818605 0.85972678 0.87895928
|
|
0.81760898 0.83676583 0.89756105 0.83946488]
|
|
|
|
mean value: 0.8522826622791292
|
|
|
|
key: test_accuracy
|
|
value: [0.83333333 0.75 0.66666667 1. 0.83333333 0.83333333
|
|
0.83333333 0.91666667 0.66666667 0.81818182]
|
|
|
|
mean value: 0.8151515151515152
|
|
|
|
key: train_accuracy
|
|
value: [0.90654206 0.92523364 0.95327103 0.93457944 0.93457944 0.94392523
|
|
0.91588785 0.92523364 0.95327103 0.92592593]
|
|
|
|
mean value: 0.9318449290411908
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.57142857 0.6 1. 0.75 0.8
|
|
0.75 0.88888889 0.6 0.8 ]
|
|
|
|
mean value: 0.7426984126984127
|
|
|
|
key: train_fscore
|
|
value: [0.87179487 0.89473684 0.93506494 0.90909091 0.91139241 0.92307692
|
|
0.88311688 0.89473684 0.93333333 0.8974359 ]
|
|
|
|
mean value: 0.905377984218757
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.5 1. 0.75 0.66666667
|
|
1. 1. 0.6 0.66666667]
|
|
|
|
mean value: 0.785
|
|
|
|
key: train_precision
|
|
value: [0.87179487 0.91891892 0.94736842 0.92105263 0.9 0.92307692
|
|
0.87179487 0.89473684 0.94594595 0.8974359 ]
|
|
|
|
mean value: 0.9092125323704271
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.75 1. 0.75 1. 0.6 0.8 0.6 1. ]
|
|
|
|
mean value: 0.75
|
|
|
|
key: train_recall
|
|
value: [0.87179487 0.87179487 0.92307692 0.8974359 0.92307692 0.92307692
|
|
0.89473684 0.89473684 0.92105263 0.8974359 ]
|
|
|
|
mean value: 0.9018218623481782
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.6875 0.6875 1. 0.8125 0.875
|
|
0.8 0.9 0.65714286 0.85714286]
|
|
|
|
mean value: 0.8026785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.89913273 0.91383861 0.94683258 0.92665913 0.9321267 0.93947964
|
|
0.91113654 0.91838291 0.94603356 0.91973244]
|
|
|
|
mean value: 0.9253354836037566
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.4 0.42857143 1. 0.6 0.66666667
|
|
0.6 0.8 0.42857143 0.66666667]
|
|
|
|
mean value: 0.6090476190476191
|
|
|
|
key: train_jcc
|
|
value: [0.77272727 0.80952381 0.87804878 0.83333333 0.8372093 0.85714286
|
|
0.79069767 0.80952381 0.875 0.81395349]
|
|
|
|
mean value: 0.8277160327855166
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.07413816 0.06058931 0.06105781 0.06061363 0.0620904 0.06173635
|
|
0.06109071 0.06248355 0.06057048 0.06052804]
|
|
|
|
mean value: 0.06248984336853027
|
|
|
|
key: score_time
|
|
value: [0.00838947 0.00880098 0.00829411 0.0082829 0.00848746 0.00843048
|
|
0.00848484 0.00820589 0.00828862 0.00822687]
|
|
|
|
mean value: 0.008389163017272949
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.40824829 0.35355339 1. 0.625 0.70710678
|
|
0.68313005 0.83666003 0.31428571 0.69006556]
|
|
|
|
mean value: 0.6250505345503478
|
|
|
|
key: train_mcc
|
|
value: [0.79826546 0.8375252 0.89876312 0.85818605 0.85972678 0.87895928
|
|
0.81760898 0.83676583 0.89756105 0.83946488]
|
|
|
|
mean value: 0.8522826622791292
|
|
|
|
key: test_accuracy
|
|
value: [0.83333333 0.75 0.66666667 1. 0.83333333 0.83333333
|
|
0.83333333 0.91666667 0.66666667 0.81818182]
|
|
|
|
mean value: 0.8151515151515152
|
|
|
|
key: train_accuracy
|
|
value: [0.90654206 0.92523364 0.95327103 0.93457944 0.93457944 0.94392523
|
|
0.91588785 0.92523364 0.95327103 0.92592593]
|
|
|
|
mean value: 0.9318449290411908
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:122: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:125: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: test_fscore
|
|
value: [0.66666667 0.57142857 0.6 1. 0.75 0.8
|
|
0.75 0.88888889 0.6 0.8 ]
|
|
|
|
mean value: 0.7426984126984127
|
|
|
|
key: train_fscore
|
|
value: [0.87179487 0.89473684 0.93506494 0.90909091 0.91139241 0.92307692
|
|
0.88311688 0.89473684 0.93333333 0.8974359 ]
|
|
|
|
mean value: 0.905377984218757
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.5 1. 0.75 0.66666667
|
|
1. 1. 0.6 0.66666667]
|
|
|
|
mean value: 0.785
|
|
|
|
key: train_precision
|
|
value: [0.87179487 0.91891892 0.94736842 0.92105263 0.9 0.92307692
|
|
0.87179487 0.89473684 0.94594595 0.8974359 ]
|
|
|
|
mean value: 0.9092125323704271
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.75 1. 0.75 1. 0.6 0.8 0.6 1. ]
|
|
|
|
mean value: 0.75
|
|
|
|
key: train_recall
|
|
value: [0.87179487 0.87179487 0.92307692 0.8974359 0.92307692 0.92307692
|
|
0.89473684 0.89473684 0.92105263 0.8974359 ]
|
|
|
|
mean value: 0.9018218623481782
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.6875 0.6875 1. 0.8125 0.875
|
|
0.8 0.9 0.65714286 0.85714286]
|
|
|
|
mean value: 0.8026785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.89913273 0.91383861 0.94683258 0.92665913 0.9321267 0.93947964
|
|
0.91113654 0.91838291 0.94603356 0.91973244]
|
|
|
|
mean value: 0.9253354836037566
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.4 0.42857143 1. 0.6 0.66666667
|
|
0.6 0.8 0.42857143 0.66666667]
|
|
|
|
mean value: 0.6090476190476191
|
|
|
|
key: train_jcc
|
|
value: [0.77272727 0.80952381 0.87804878 0.83333333 0.8372093 0.85714286
|
|
0.79069767 0.80952381 0.875 0.81395349]
|
|
|
|
mean value: 0.8277160327855166
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models:
[('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                                           n_estimators=1000, n_jobs=10, oob_score=True,
                                           random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
                           colsample_bynode=None, colsample_bytree=None,
                           enable_categorical=False, gamma=None, gpu_id=None,
                           importance_type=None, interaction_constraints=None,
                           learning_rate=None, max_delta_step=None, max_depth=None,
                           min_child_weight=None, missing=nan, monotone_constraints=None,
                           n_estimators=100, n_jobs=None, num_parallel_tree=None,
                           predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
                           scale_pos_weight=None, subsample=None, tree_method=None,
                           use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
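The pipeline printed above scales the numerical columns with MinMaxScaler, one-hot encodes the categorical columns, and passes the result to the classifier. The following is a minimal sketch of how such a pipeline could be assembled, not the script's actual code. The toy DataFrame, its values, and the names `num_cols` and `cat_cols` are assumptions made for illustration; only the transformer names ('num', 'cat', 'prep', 'model') and the estimators are taken from the log.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy stand-in for the real feature table (values are made up for illustration).
X = pd.DataFrame({
    "ligand_distance": [3.2, 10.5, 7.1, 15.0, 4.4, 12.3],
    "rsa": [0.10, 0.55, 0.32, 0.80, 0.05, 0.61],
    "ss_class": ["helix", "sheet", "loop", "helix", "loop", "sheet"],
    "active_site": ["yes", "no", "no", "no", "yes", "no"],
})
y = [1, 0, 1, 0, 1, 0]

# Hypothetical column lists; the real run uses 44 numerical and 7 categorical features.
num_cols = ["ligand_distance", "rsa"]
cat_cols = ["ss_class", "active_site"]

# Same layout as the printed repr: scale numerical, one-hot encode categorical.
prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
    remainder="passthrough",
)

pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(random_state=42)),
])
pipe.fit(X, y)

# 2 scaled columns + 3 ss_class levels + 2 active_site levels = 7 columns.
Xt = pipe.named_steps["prep"].transform(X)
print(Xt.shape)
```

Fitting the preprocessing inside the pipeline means the scaler and encoder are refitted on each training fold during cross-validation, which avoids leaking test-fold statistics into training.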
key: fit_time
value: [0.01721478 0.01214457 0.01244617 0.01377439 0.01370144 0.01362348
 0.01294899 0.01230645 0.01298475 0.0133431 ]

mean value: 0.013448810577392578

key: score_time
value: [0.01065063 0.00836754 0.00845647 0.00828004 0.00818396 0.00835776
 0.00841212 0.0085628 0.00873351 0.00875807]

mean value: 0.008676290512084961

key: test_mcc
value: [0.8819171 0.5 0.37796447 0.875 1. 0.60714286
 0.76376262 1. 0.64465837 0.60714286]

mean value: 0.7257588278029415

key: train_mcc
value: [0.79411765 0.85331034 0.79599234 0.76678748 0.81031543 0.82480818
 0.81031543 0.82480818 0.79688349 0.85400682]

mean value: 0.8131345350406455

key: test_accuracy
value: [0.9375 0.75 0.66666667 0.93333333 1. 0.8
 0.86666667 1. 0.8 0.8 ]

mean value: 0.8554166666666667

key: train_accuracy
value: [0.89705882 0.92647059 0.89781022 0.88321168 0.90510949 0.91240876
 0.90510949 0.91240876 0.89781022 0.9270073 ]

mean value: 0.9064405324173466

key: test_fscore
value: [0.93333333 0.75 0.70588235 0.93333333 1. 0.8
 0.85714286 1. 0.84210526 0.8 ]

mean value: 0.8621797139908595

key: train_fscore
value: [0.89705882 0.92537313 0.89705882 0.88235294 0.90510949 0.91304348
 0.90510949 0.91176471 0.89393939 0.92647059]

mean value: 0.9057280866983752

key: test_precision
value: [1. 0.75 0.6 0.875 1. 0.75
 1. 1. 0.72727273 0.85714286]

mean value: 0.8559415584415584

key: train_precision
value: [0.89705882 0.93939394 0.91044776 0.89552239 0.91176471 0.91304348
 0.89855072 0.91176471 0.921875 0.92647059]

mean value: 0.9125892115075633

key: test_recall
value: [0.875 0.75 0.85714286 1. 1. 0.85714286
 0.75 1. 1. 0.75 ]

mean value: 0.8839285714285714

key: train_recall
value: [0.89705882 0.91176471 0.88405797 0.86956522 0.89855072 0.91304348
 0.91176471 0.91176471 0.86764706 0.92647059]

mean value: 0.8991687979539642

key: test_roc_auc
value: [0.9375 0.75 0.67857143 0.9375 1. 0.80357143
 0.875 1. 0.78571429 0.80357143]

mean value: 0.8571428571428571

key: train_roc_auc
value: [0.89705882 0.92647059 0.89791134 0.88331202 0.90515772 0.91240409
 0.90515772 0.91240409 0.89759165 0.92700341]

mean value: 0.9064471440750212

key: test_jcc
value: [0.875 0.6 0.54545455 0.875 1. 0.66666667
 0.75 1. 0.72727273 0.66666667]

mean value: 0.7706060606060606

key: train_jcc
value: [0.81333333 0.86111111 0.81333333 0.78947368 0.82666667 0.84
 0.82666667 0.83783784 0.80821918 0.8630137 ]

mean value: 0.8279655509871804
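As a reading aid, the first-fold test scores above are all consistent with a single confusion matrix. The sketch below assumes a 16-sample test fold with TP=7, FN=1, FP=0, TN=8. These counts are inferred from the reported scores and are not printed in the log. It also assumes that `jcc` denotes the Jaccard index and that `roc_auc` was computed from hard class labels, in which case it equals the mean of recall and specificity.

```python
import math

# Assumed fold-1 confusion matrix (inferred from the scores, not from the log).
tp, fn, fp, tn = 7, 1, 0, 8
n = tp + fn + fp + tn

accuracy = (tp + tn) / n
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
fscore = 2 * tp / (2 * tp + fp + fn)
jcc = tp / (tp + fp + fn)  # Jaccard index of the positive class

# Matthews correlation coefficient from the four cell counts.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# For hard 0/1 predictions, ROC AUC reduces to the mean of recall and specificity.
roc_auc_hard = (recall + specificity) / 2

for name, value in [
    ("test_mcc", mcc),
    ("test_accuracy", accuracy),
    ("test_fscore", fscore),
    ("test_precision", precision),
    ("test_recall", recall),
    ("test_roc_auc", roc_auc_hard),
    ("test_jcc", jcc),
]:
    print(f"{name}: {value:.8f}")
```

With these counts each computed value agrees with the first entry of the corresponding `test_*` array to the printed precision.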
MCC on Blind test: 0.11

Accuracy on Blind test: 0.65
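The key/value/mean blocks and the blind-test lines above have the shape of scikit-learn's `cross_validate` output with several named scorers and `return_train_score=True`, followed by a separate evaluation on held-out data. The following is a minimal sketch under that assumption, not the script's actual code. The synthetic data, the 70/30 split, the shuffled `StratifiedKFold`, and the scorer definitions are all illustrative choices; only the metric names and the 10-fold layout are taken from the log.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             make_scorer, matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import (StratifiedKFold, cross_validate,
                                     train_test_split)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data; the real run uses the merged feature table.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Scorer names chosen so the result keys match the log (test_mcc, train_jcc, ...).
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": make_scorer(accuracy_score),
    "fscore": make_scorer(f1_score),
    "precision": make_scorer(precision_score),
    "recall": make_scorer(recall_score),
    "roc_auc": make_scorer(roc_auc_score),  # scored on hard labels
    "jcc": make_scorer(jaccard_score),
}

pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegression(random_state=42)),
])

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X_train, y_train, cv=skf,
                        scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", value.mean())

# Blind test: refit on all training data, then score the held-out split once.
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_blind)
blind_mcc = matthews_corrcoef(y_blind, y_pred)
blind_acc = accuracy_score(y_blind, y_pred)
print("MCC on Blind test:", round(blind_mcc, 2))
print("Accuracy on Blind test:", round(blind_acc, 2))
```

Comparing the mean `test_mcc` with the blind-test MCC is the useful check here: a large gap between the two, as in the log above, suggests the cross-validated estimate does not transfer to the held-out data.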
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
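The warning above means the lbfgs solver hit its iteration cap (`max_iter`, 100 by default for `LogisticRegressionCV`) before converging, and the message itself suggests raising `max_iter` or scaling the data. The following is a minimal sketch of both remedies on synthetic data, not a change taken from the script. The helper `count_convergence_warnings`, the inflated feature scale, and the value `max_iter=5000` are assumptions chosen for illustration.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def count_convergence_warnings(model, X, y):
    """Fit the model and return how many ConvergenceWarnings were raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X, y)
    return sum(issubclass(w.category, ConvergenceWarning) for w in caught)


# Synthetic data with deliberately large feature values to stress the solver.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
X = X * 1000.0

# Default settings on unscaled data, as a baseline.
default_model = LogisticRegressionCV(cv=5, random_state=42)
n_default = count_convergence_warnings(default_model, X, y)

# Remedy: scale inside the pipeline and allow more solver iterations.
fixed_model = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegressionCV(cv=5, max_iter=5000, random_state=42)),
])
n_fixed = count_convergence_warnings(fixed_model, X, y)

print("ConvergenceWarnings with defaults:", n_default)
print("ConvergenceWarnings with scaling and max_iter=5000:", n_fixed)
```

In the logged run the numerical features are already min-max scaled by the pipeline, so raising `max_iter` on the `LogisticRegressionCV` entry is the more likely fix; whether a given value is sufficient would need to be checked on the real data.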
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.39265418 0.37149286 0.37123227 0.37943006 0.37867284 0.38540936
0.36591649 0.36693406 0.37429166 0.37283516]

mean value: 0.3758868932723999

key: score_time
value: [0.00942802 0.0091536 0.0086658 0.00927925 0.00937963 0.00886846
0.00882983 0.00927758 0.009166 0.00868988]

mean value: 0.009073805809020997

key: test_mcc
value: [0.8819171 0.62994079 0.49099025 0.76376262 0.73214286 0.60714286
0.6000992 1. 0.64465837 0.33928571]

mean value: 0.6689939758834577

key: train_mcc
value: [0.91215932 0.92657079 0.94201665 0.8978896 0.97080136 0.97122151
0.94201665 0.88320546 1. 0.95630861]

mean value: 0.9402189943658086

key: test_accuracy
value: [0.9375 0.8125 0.73333333 0.86666667 0.86666667 0.8
0.8 1. 0.8 0.66666667]

mean value: 0.8283333333333334

key: train_accuracy
value: [0.95588235 0.96323529 0.97080292 0.94890511 0.98540146 0.98540146
0.97080292 0.94160584 1. 0.97810219]

mean value: 0.9700139544869043

key: test_fscore
value: [0.93333333 0.82352941 0.75 0.875 0.85714286 0.8
0.82352941 1. 0.84210526 0.66666667]

mean value: 0.8371306943830163

key: train_fscore
value: [0.95652174 0.96350365 0.97058824 0.94964029 0.98550725 0.98529412
0.97101449 0.94117647 1. 0.97810219]

mean value: 0.9701348428976124

key: test_precision
value: [1. 0.77777778 0.66666667 0.77777778 0.85714286 0.75
0.77777778 1. 0.72727273 0.71428571]

mean value: 0.8048701298701298

key: train_precision
value: [0.94285714 0.95652174 0.98507463 0.94285714 0.98550725 1.
0.95714286 0.94117647 1. 0.97101449]

mean value: 0.968215171857192

key: test_recall
value: [0.875 0.875 0.85714286 1. 0.85714286 0.85714286
0.875 1. 1. 0.625 ]

mean value: 0.8821428571428571

key: train_recall
value: [0.97058824 0.97058824 0.95652174 0.95652174 0.98550725 0.97101449
0.98529412 0.94117647 1. 0.98529412]

mean value: 0.9722506393861893

key: test_roc_auc
value: [0.9375 0.8125 0.74107143 0.875 0.86607143 0.80357143
0.79464286 1. 0.78571429 0.66964286]

mean value: 0.8285714285714286

key: train_roc_auc
value: [0.95588235 0.96323529 0.97090793 0.9488491 0.98540068 0.98550725
0.97090793 0.94160273 1. 0.97815431]

mean value: 0.9700447570332481

key: test_jcc
value: [0.875 0.7 0.6 0.77777778 0.75 0.66666667
0.7 1. 0.72727273 0.5 ]

mean value: 0.7296717171717172

key: train_jcc
value: [0.91666667 0.92957746 0.94285714 0.90410959 0.97142857 0.97101449
0.94366197 0.88888889 1. 0.95714286]

mean value: 0.9425347645398564

MCC on Blind test: 0.07

Accuracy on Blind test: 0.72

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.00969172 0.00904489 0.00695467 0.00679207 0.00658846 0.00662065
0.00658751 0.00682616 0.00665498 0.00697279]

mean value: 0.007273387908935547

key: score_time
value: [0.01047778 0.01015592 0.00812674 0.0078907 0.00783539 0.00783157
0.00781465 0.00791001 0.0078311 0.00797391]

mean value: 0.008384776115417481

key: test_mcc
value: [0.8819171 0.5 0.33928571 0.56407607 0.49099025 0.60714286
0.46428571 0.73214286 0.64465837 0.07142857]

mean value: 0.5295927517042964

key: train_mcc
value: [0.61098829 0.74337629 0.6462903 0.59999905 0.55137884 0.71313464
0.65613085 0.71021843 0.63063055 0.63867147]

mean value: 0.6500818694571209

key: test_accuracy
value: [0.9375 0.75 0.66666667 0.73333333 0.73333333 0.8
0.73333333 0.86666667 0.8 0.53333333]

mean value: 0.7554166666666666

key: train_accuracy
value: [0.80147059 0.86764706 0.81751825 0.79562044 0.76642336 0.8540146
0.81751825 0.84671533 0.81021898 0.81021898]

mean value: 0.8187365822241305

key: test_fscore
value: [0.94117647 0.75 0.66666667 0.77777778 0.75 0.8
0.75 0.875 0.84210526 0.53333333]

mean value: 0.7686059511523908

key: train_fscore
value: [0.81632653 0.87671233 0.83443709 0.81333333 0.79487179 0.84615385
0.83660131 0.82644628 0.82432432 0.82894737]

mean value: 0.8298154200757712

key: test_precision
value: [0.88888889 0.75 0.625 0.63636364 0.66666667 0.75
0.75 0.875 0.72727273 0.57142857]

mean value: 0.7240620490620491

key: train_precision
value: [0.75949367 0.82051282 0.76829268 0.75308642 0.71264368 0.90163934
0.75294118 0.94339623 0.7625 0.75 ]

mean value: 0.7924506019387709

key: test_recall
value: [1. 0.75 0.71428571 1. 0.85714286 0.85714286
0.75 0.875 1. 0.5 ]

mean value: 0.8303571428571428

key: train_recall
value: [0.88235294 0.94117647 0.91304348 0.88405797 0.89855072 0.79710145
0.94117647 0.73529412 0.89705882 0.92647059]

mean value: 0.8816283034953112

key: test_roc_auc
value: [0.9375 0.75 0.66964286 0.75 0.74107143 0.80357143
0.73214286 0.86607143 0.78571429 0.53571429]

mean value: 0.7571428571428571

key: train_roc_auc
value: [0.80147059 0.86764706 0.81681586 0.79497016 0.76545183 0.85443308
0.81841432 0.84590793 0.81084825 0.81106138]

mean value: 0.8187020460358057

key: test_jcc
value: [0.88888889 0.6 0.5 0.63636364 0.6 0.66666667
0.6 0.77777778 0.72727273 0.36363636]

mean value: 0.636060606060606

key: train_jcc
value: [0.68965517 0.7804878 0.71590909 0.68539326 0.65957447 0.73333333
0.71910112 0.70422535 0.70114943 0.70786517]

mean value: 0.7096694197581203

MCC on Blind test: 0.03

Accuracy on Blind test: 0.49

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00759959 0.00745654 0.0069356 0.00687218 0.00700045 0.00688291
0.0068655 0.00682497 0.00688171 0.00696039]

mean value: 0.007027983665466309

key: score_time
value: [0.00797582 0.00790715 0.00785089 0.0078764 0.00810456 0.00792956
0.0079031 0.00789905 0.00793099 0.00804949]

mean value: 0.007942700386047363

key: test_mcc
value: [0.37796447 0.25819889 0.07142857 0.49099025 0.47245559 0.13363062
0.46428571 0.73214286 0.33928571 0.32732684]

mean value: 0.36677095205019633

key: train_mcc
value: [0.5008673 0.53311399 0.52059257 0.45151662 0.49006025 0.5360985
0.52559229 0.51215762 0.49197671 0.53517487]

mean value: 0.5097150730382196

key: test_accuracy
value: [0.6875 0.625 0.53333333 0.73333333 0.73333333 0.53333333
0.73333333 0.86666667 0.66666667 0.66666667]

mean value: 0.6779166666666666

key: train_accuracy
value: [0.75 0.76470588 0.75912409 0.72262774 0.74452555 0.76642336
0.75912409 0.75182482 0.74452555 0.76642336]

mean value: 0.7529304422498926

key: test_fscore
value: [0.70588235 0.57142857 0.53333333 0.75 0.66666667 0.63157895
0.75 0.875 0.66666667 0.70588235]

mean value: 0.6856438891346012

key: train_fscore
value: [0.75714286 0.77777778 0.77241379 0.74666667 0.75524476 0.78082192
0.7755102 0.77027027 0.75524476 0.77464789]

mean value: 0.7665740884664326

key: test_precision
value: [0.66666667 0.66666667 0.5 0.66666667 0.8 0.5
0.75 0.875 0.71428571 0.66666667]

mean value: 0.680595238095238

key: train_precision
value: [0.73611111 0.73684211 0.73684211 0.69135802 0.72972973 0.74025974
0.72151899 0.7125 0.72 0.74324324]

mean value: 0.726840504690327

key: test_recall
value: [0.75 0.5 0.57142857 0.85714286 0.57142857 0.85714286
0.75 0.875 0.625 0.75 ]

mean value: 0.7107142857142857

key: train_recall
value: [0.77941176 0.82352941 0.8115942 0.8115942 0.7826087 0.82608696
0.83823529 0.83823529 0.79411765 0.80882353]

mean value: 0.8114236999147485

key: test_roc_auc
value: [0.6875 0.625 0.53571429 0.74107143 0.72321429 0.55357143
0.73214286 0.86607143 0.66964286 0.66071429]

mean value: 0.6794642857142857

key: train_roc_auc
value: [0.75 0.76470588 0.75873828 0.72197357 0.74424552 0.76598465
0.75969736 0.75245098 0.74488491 0.76673061]

mean value: 0.7529411764705882

key: test_jcc
value: [0.54545455 0.4 0.36363636 0.6 0.5 0.46153846
0.6 0.77777778 0.5 0.54545455]

mean value: 0.5293861693861693

key: train_jcc
value: [0.6091954 0.63636364 0.62921348 0.59574468 0.60674157 0.64044944
0.63333333 0.62637363 0.60674157 0.63218391]

mean value: 0.6216340654682218

MCC on Blind test: 0.1

Accuracy on Blind test: 0.6

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00746417 0.00665522 0.00723457 0.00721335 0.00727034 0.00734687
0.00666738 0.00743985 0.00722885 0.00735736]

mean value: 0.007187795639038086

key: score_time
value: [0.009269 0.00891018 0.00945759 0.00945568 0.00962353 0.01024604
0.00979686 0.00947714 0.00947309 0.00945425]

mean value: 0.009516334533691407

key: test_mcc
value: [0.51639778 0.25819889 0.33928571 0.66143783 0.76376262 0.60714286
0.37796447 0.75592895 0.64465837 0.47245559]

mean value: 0.5397233065771696

key: train_mcc
value: [0.63242133 0.69486799 0.73721228 0.640228 0.69398264 0.64981886
0.69976319 0.63512361 0.69352089 0.63574336]

mean value: 0.6712682142948946

key: test_accuracy
value: [0.75 0.625 0.66666667 0.8 0.86666667 0.8
0.66666667 0.86666667 0.8 0.73333333]

mean value: 0.7575000000000001

key: train_accuracy
value: [0.81617647 0.84558824 0.86861314 0.81751825 0.84671533 0.82481752
0.84671533 0.81751825 0.84671533 0.81751825]

mean value: 0.8347896092743666

key: test_fscore
value: [0.77777778 0.57142857 0.66666667 0.82352941 0.875 0.8
0.61538462 0.88888889 0.84210526 0.77777778]

mean value: 0.7638558972846898

key: train_fscore
value: [0.81751825 0.85314685 0.86956522 0.82993197 0.85106383 0.82857143
0.85517241 0.81751825 0.84671533 0.82014388]

mean value: 0.8389347425188644

key: test_precision
value: [0.7 0.66666667 0.625 0.7 0.77777778 0.75
0.8 0.8 0.72727273 0.7 ]

mean value: 0.7246717171717172

key: train_precision
value: [0.8115942 0.81333333 0.86956522 0.78205128 0.83333333 0.81690141
0.80519481 0.8115942 0.84057971 0.8028169 ]

mean value: 0.8186964397105242

key: test_recall
value: [0.875 0.5 0.71428571 1. 1. 0.85714286
0.5 1. 1. 0.875 ]

mean value: 0.8321428571428572

key: train_recall
value: [0.82352941 0.89705882 0.86956522 0.88405797 0.86956522 0.84057971
0.91176471 0.82352941 0.85294118 0.83823529]

mean value: 0.8610826939471441

key: test_roc_auc
value: [0.75 0.625 0.66964286 0.8125 0.875 0.80357143
0.67857143 0.85714286 0.78571429 0.72321429]

mean value: 0.7580357142857143

key: train_roc_auc
value: [0.81617647 0.84558824 0.86860614 0.81702899 0.84654731 0.82470162
0.8471867 0.81756181 0.84676044 0.81766837]

mean value: 0.8347826086956521

key: test_jcc
value: [0.63636364 0.4 0.5 0.7 0.77777778 0.66666667
0.44444444 0.8 0.72727273 0.63636364]

mean value: 0.6288888888888888

key: train_jcc
value: [0.69135802 0.74390244 0.76923077 0.70930233 0.74074074 0.70731707
0.74698795 0.69135802 0.73417722 0.69512195]

mean value: 0.7229496515347358

MCC on Blind test: 0.05

Accuracy on Blind test: 0.61

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
|
|
value: [0.0097928 0.00793552 0.0076406 0.00769567 0.00771952 0.00770187
|
|
0.00762033 0.00775814 0.00766444 0.00767016]
|
|
|
|
mean value: 0.007919907569885254
|
|
|
|
key: score_time
|
|
value: [0.00912237 0.00797772 0.00790691 0.00800848 0.00794578 0.00804377
|
|
0.00801921 0.00795007 0.00798464 0.00796342]
|
|
|
|
mean value: 0.008092236518859864
|
|
|
|
key: test_mcc
|
|
value: [0.75 0.5 0.19642857 0.76376262 0.73214286 0.73214286
|
|
0.66143783 1. 0.64465837 0.60714286]
|
|
|
|
mean value: 0.6587715957669568
|
|
|
|
key: train_mcc
|
|
value: [0.79446135 0.76470588 0.78182997 0.82480818 0.79590547 0.79590547
|
|
0.79560955 0.79560955 0.78298457 0.78107015]
|
|
|
|
mean value: 0.7912890152297882
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.75 0.6 0.86666667 0.86666667 0.86666667
|
|
0.8 1. 0.8 0.8 ]
|
|
|
|
mean value: 0.8225
|
|
|
|
key: train_accuracy
|
|
value: [0.89705882 0.88235294 0.89051095 0.91240876 0.89781022 0.89781022
|
|
0.89781022 0.89781022 0.89051095 0.89051095]
|
|
|
|
mean value: 0.8954594246457708
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.75 0.57142857 0.875 0.85714286 0.85714286
|
|
0.76923077 1. 0.84210526 0.8 ]
|
|
|
|
mean value: 0.819705031810295
|
|
|
|
key: train_fscore
|
|
value: [0.89552239 0.88235294 0.88888889 0.91304348 0.9 0.9
|
|
0.89705882 0.89705882 0.88549618 0.88888889]
|
|
|
|
mean value: 0.894831041553975
|
|
|
|
key: test_precision
|
|
value: [0.875 0.75 0.57142857 0.77777778 0.85714286 0.85714286
|
|
1. 1. 0.72727273 0.85714286]
|
|
|
|
mean value: 0.8272907647907648
|
|
|
|
key: train_precision
|
|
value: [0.90909091 0.88235294 0.90909091 0.91304348 0.88732394 0.88732394
|
|
0.89705882 0.89705882 0.92063492 0.89552239]
|
|
|
|
mean value: 0.8998501080696548
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.57142857 1. 0.85714286 0.85714286
|
|
0.625 1. 1. 0.75 ]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [0.88235294 0.88235294 0.86956522 0.91304348 0.91304348 0.91304348
|
|
0.89705882 0.89705882 0.85294118 0.88235294]
|
|
|
|
mean value: 0.8902813299232737
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.75 0.59821429 0.875 0.86607143 0.86607143
|
|
0.8125 1. 0.78571429 0.80357143]
|
|
|
|
mean value: 0.8232142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.89705882 0.88235294 0.89066496 0.91240409 0.89769821 0.89769821
|
|
0.89780477 0.89780477 0.8902387 0.89045183]
|
|
|
|
mean value: 0.8954177323103154
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.6 0.4 0.77777778 0.75 0.75
|
|
0.625 1. 0.72727273 0.66666667]
|
|
|
|
mean value: 0.7074494949494949
|
|
|
|
key: train_jcc
|
|
value: [0.81081081 0.78947368 0.8 0.84 0.81818182 0.81818182
|
|
0.81333333 0.81333333 0.79452055 0.8 ]
|
|
|
|
mean value: 0.8097835345996846
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.61664748 0.47898817 0.45502543 0.50637126 0.53669834 0.53992033
|
|
0.47287774 0.47633958 0.47858143 0.62103176]
|
|
|
|
mean value: 0.5182481527328491
|
|
|
|
key: score_time
|
|
value: [0.01329303 0.01312971 0.01097465 0.01341534 0.01492548 0.01332402
|
|
0.01094341 0.01338291 0.01885128 0.01098609]
|
|
|
|
mean value: 0.013322591781616211
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.51639778 0.37796447 1. 0.60714286 0.60714286
|
|
0.46428571 0.87287156 0.64465837 0.60714286]
|
|
|
|
mean value: 0.6579523574070305
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.75 0.66666667 1. 0.8 0.8
|
|
0.73333333 0.93333333 0.8 0.8 ]
|
|
|
|
mean value: 0.8220833333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.71428571 0.70588235 1. 0.8 0.8
|
|
0.75 0.94117647 0.84210526 0.8 ]
|
|
|
|
mean value: 0.8286783134306354
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.83333333 0.6 1. 0.75 0.75
|
|
0.75 0.88888889 0.72727273 0.85714286]
|
|
|
|
mean value: 0.8156637806637806
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.625 0.85714286 1. 0.85714286 0.85714286
|
|
0.75 1. 1. 0.75 ]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.75 0.67857143 1. 0.80357143 0.80357143
|
|
0.73214286 0.92857143 0.78571429 0.80357143]
|
|
|
|
mean value: 0.8223214285714285
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.55555556 0.54545455 1. 0.66666667 0.66666667
|
|
0.6 0.88888889 0.72727273 0.66666667]
|
|
|
|
mean value: 0.7192171717171717
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02302074 0.00763321 0.00718451 0.00732636 0.00716519 0.00725317
|
|
0.00720024 0.00724506 0.00735903 0.00749445]
|
|
|
|
mean value: 0.00888819694519043
|
|
|
|
key: score_time
|
|
value: [0.01008129 0.00808263 0.00788021 0.00784898 0.00779343 0.00776839
|
|
0.00773787 0.00773025 0.00829577 0.00780368]
|
|
|
|
mean value: 0.00810225009918213
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 1. 1. 1. 0.6000992 0.73214286
|
|
0.87287156 0.75592895 0.73214286 0.56407607]
|
|
|
|
mean value: 0.8139178597903081
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 1. 1. 1. 0.8 0.86666667
|
|
0.93333333 0.86666667 0.86666667 0.73333333]
|
|
|
|
mean value: 0.9004166666666666
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 1. 1. 1. 0.76923077 0.85714286
|
|
0.94117647 0.88888889 0.875 0.66666667]
|
|
|
|
mean value: 0.8939282123105652
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 1. 1. 0.83333333 0.85714286
|
|
0.88888889 0.8 0.875 1. ]
|
|
|
|
mean value: 0.9143253968253968
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.71428571 0.85714286
|
|
1. 1. 0.875 0.5 ]
|
|
|
|
mean value: 0.8946428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 1. 1. 1. 0.79464286 0.86607143
|
|
0.92857143 0.85714286 0.86607143 0.75 ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 1. 1. 1. 0.625 0.75
|
|
0.88888889 0.8 0.77777778 0.5 ]
|
|
|
|
mean value: 0.8230555555555555
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07873535 0.07909012 0.07848859 0.07919955 0.07896852 0.07857132
|
|
0.08103371 0.08165836 0.07918024 0.08250403]
|
|
|
|
mean value: 0.07974298000335693
|
|
|
|
key: score_time
|
|
value: [0.01622057 0.01643443 0.01677704 0.01642728 0.01640582 0.01631761
|
|
0.01749635 0.01630569 0.01675391 0.01715064]
|
|
|
|
mean value: 0.01662893295288086
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.51639778 0.49099025 1. 0.875 0.73214286
|
|
0.76376262 1. 0.75592895 0.875 ]
|
|
|
|
mean value: 0.7891139555200787
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.75 0.73333333 1. 0.93333333 0.86666667
|
|
0.86666667 1. 0.86666667 0.93333333]
|
|
|
|
mean value: 0.88875
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.71428571 0.75 1. 0.93333333 0.85714286
|
|
0.85714286 1. 0.88888889 0.93333333]
|
|
|
|
mean value: 0.8867460317460317
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.83333333 0.66666667 1. 0.875 0.85714286
|
|
1. 1. 0.8 1. ]
|
|
|
|
mean value: 0.9032142857142857
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.625 0.85714286 1. 1. 0.85714286
|
|
0.75 1. 1. 0.875 ]
|
|
|
|
mean value: 0.8839285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.75 0.74107143 1. 0.9375 0.86607143
|
|
0.875 1. 0.85714286 0.9375 ]
|
|
|
|
mean value: 0.8901785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.55555556 0.6 1. 0.875 0.75
|
|
0.75 1. 0.8 0.875 ]
|
|
|
|
mean value: 0.8080555555555555
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00679111 0.00661206 0.00676751 0.00668883 0.00663257 0.00666237
|
|
0.00663257 0.00670314 0.00696945 0.00672388]
|
|
|
|
mean value: 0.006718349456787109
|
|
|
|
key: score_time
|
|
value: [0.00769448 0.00768995 0.00776768 0.00772476 0.00775385 0.00774527
|
|
0.00774169 0.00774693 0.00784659 0.00774169]
|
|
|
|
mean value: 0.007745289802551269
|
|
|
|
key: test_mcc
|
|
value: [0.40451992 0.40451992 0.32732684 1. 0.76376262 0.46428571
|
|
0.13363062 0.87287156 0.73214286 0.21821789]
|
|
|
|
mean value: 0.5321277929700597
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.6875 0.6875 0.66666667 1. 0.86666667 0.73333333
|
|
0.53333333 0.93333333 0.86666667 0.6 ]
|
|
|
|
mean value: 0.7575
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.61538462 0.61538462 1. 0.875 0.71428571
|
|
0.36363636 0.94117647 0.875 0.57142857]
|
|
|
|
mean value: 0.7308138455971274
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.8 0.66666667 1. 0.77777778 0.71428571
|
|
0.66666667 0.88888889 0.875 0.66666667]
|
|
|
|
mean value: 0.7692316017316018
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.5 0.57142857 1. 1. 0.71428571
|
|
0.25 1. 0.875 0.5 ]
|
|
|
|
mean value: 0.7285714285714285
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.6875 0.6875 0.66071429 1. 0.875 0.73214286
|
|
0.55357143 0.92857143 0.86607143 0.60714286]
|
|
|
|
mean value: 0.7598214285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.44444444 0.44444444 1. 0.77777778 0.55555556
|
|
0.22222222 0.88888889 0.77777778 0.4 ]
|
|
|
|
mean value: 0.6094444444444445
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.98909354 0.98514724 1.04687738 0.98289633 0.98306084 0.98102474
|
|
0.9808023 0.98257184 0.98120975 0.97967005]
|
|
|
|
mean value: 0.9892354011535645
|
|
|
|
key: score_time
|
|
value: [0.09175563 0.08826041 0.08760238 0.08777761 0.08745551 0.08774495
|
|
0.08790946 0.08762598 0.08742118 0.08845329]
|
|
|
|
mean value: 0.08820064067840576
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.75 0.76376262 1. 0.875 0.73214286
|
|
0.60714286 0.87287156 0.87287156 0.76376262]
|
|
|
|
mean value: 0.8119471171513797
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.86666667 1. 0.93333333 0.86666667
|
|
0.8 0.93333333 0.93333333 0.86666667]
|
|
|
|
mean value: 0.90125
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.875 0.875 1. 0.93333333 0.85714286
|
|
0.8 0.94117647 0.94117647 0.85714286]
|
|
|
|
mean value: 0.9013305322128852
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.77777778 1. 0.875 0.85714286
|
|
0.85714286 0.88888889 0.88888889 1. ]
|
|
|
|
mean value: 0.901984126984127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 1. 1. 1. 0.85714286
|
|
0.75 1. 1. 0.75 ]
|
|
|
|
mean value: 0.9107142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.875 0.875 1. 0.9375 0.86607143
|
|
0.80357143 0.92857143 0.92857143 0.875 ]
|
|
|
|
mean value: 0.9026785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.77777778 0.77777778 1. 0.875 0.75
|
|
0.66666667 0.88888889 0.88888889 0.75 ]
|
|
|
|
mean value: 0.825
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.8277626 0.8267982 0.83943486 0.95936847 0.89719224 0.9292078
|
|
0.87691498 0.90619445 0.85252666 0.84288502]
|
|
|
|
mean value: 0.8758285284042359
|
|
|
|
key: score_time
|
|
value: [0.23116565 0.20367575 0.20599627 0.15598726 0.19488597 0.1595974
|
|
0.24725604 0.22806668 0.2303443 0.21640897]
|
|
|
|
mean value: 0.20733842849731446
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.75 0.76376262 1. 0.875 0.73214286
|
|
0.60714286 0.87287156 0.87287156 0.66143783]
|
|
|
|
mean value: 0.8017146383453971
|
|
|
|
key: train_mcc
|
|
value: [0.98540068 0.98540068 0.95630861 0.98550418 0.98550418 0.98550418
|
|
0.98550418 0.97080136 0.97080136 0.98550418]
|
|
|
|
mean value: 0.9796233587390223
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.86666667 1. 0.93333333 0.86666667
|
|
0.8 0.93333333 0.93333333 0.8 ]
|
|
|
|
mean value: 0.8945833333333334
|
|
|
|
key: train_accuracy
|
|
value: [0.99264706 0.99264706 0.97810219 0.99270073 0.99270073 0.99270073
|
|
0.99270073 0.98540146 0.98540146 0.99270073]
|
|
|
|
mean value: 0.9897702876771146
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.875 0.875 1. 0.93333333 0.85714286
|
|
0.8 0.94117647 0.94117647 0.76923077]
|
|
|
|
mean value: 0.8925393234216764
|
|
|
|
key: train_fscore
|
|
value: [0.99259259 0.99259259 0.97810219 0.99280576 0.99280576 0.99280576
|
|
0.99259259 0.98529412 0.98529412 0.99259259]
|
|
|
|
mean value: 0.9897478061632561
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.77777778 1. 0.875 0.85714286
|
|
0.85714286 0.88888889 0.88888889 1. ]
|
|
|
|
mean value: 0.901984126984127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.98529412 0.98571429 0.98571429 0.98571429
|
|
1. 0.98529412 0.98529412 1. ]
|
|
|
|
mean value: 0.9913025210084034
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 1. 1. 1. 0.85714286
|
|
0.75 1. 1. 0.625 ]
|
|
|
|
mean value: 0.8982142857142857
|
|
|
|
key: train_recall
|
|
value: [0.98529412 0.98529412 0.97101449 1. 1. 1.
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9882779198635976
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.875 0.875 1. 0.9375 0.86607143
|
|
0.80357143 0.92857143 0.92857143 0.8125 ]
|
|
|
|
mean value: 0.8964285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.99264706 0.99264706 0.97815431 0.99264706 0.99264706 0.99264706
|
|
0.99264706 0.98540068 0.98540068 0.99264706]
|
|
|
|
mean value: 0.9897485080988918
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.77777778 0.77777778 1. 0.875 0.75
|
|
0.66666667 0.88888889 0.88888889 0.625 ]
|
|
|
|
mean value: 0.8125
|
|
|
|
key: train_jcc
|
|
value: [0.98529412 0.98529412 0.95714286 0.98571429 0.98571429 0.98571429
|
|
0.98529412 0.97101449 0.97101449 0.98529412]
|
|
|
|
mean value: 0.9797491170381196
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01690888 0.00677323 0.00677204 0.00677943 0.0067265 0.00685477
|
|
0.00684571 0.00681353 0.00681353 0.00683665]
|
|
|
|
mean value: 0.00781242847442627
|
|
|
|
key: score_time
|
|
value: [0.01041579 0.00778389 0.00794005 0.00778031 0.00778699 0.00781918
|
|
0.00782156 0.00780797 0.00778556 0.0078373 ]
|
|
|
|
mean value: 0.008077859878540039
|
|
|
|
key: test_mcc
|
|
value: [0.37796447 0.25819889 0.07142857 0.49099025 0.47245559 0.13363062
|
|
0.46428571 0.73214286 0.33928571 0.32732684]
|
|
|
|
mean value: 0.36677095205019633
|
|
|
|
key: train_mcc
|
|
value: [0.5008673 0.53311399 0.52059257 0.45151662 0.49006025 0.5360985
|
|
0.52559229 0.51215762 0.49197671 0.53517487]
|
|
|
|
mean value: 0.5097150730382196
|
|
|
|
key: test_accuracy
|
|
value: [0.6875 0.625 0.53333333 0.73333333 0.73333333 0.53333333
|
|
0.73333333 0.86666667 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6779166666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.75 0.76470588 0.75912409 0.72262774 0.74452555 0.76642336
|
|
0.75912409 0.75182482 0.74452555 0.76642336]
|
|
|
|
mean value: 0.7529304422498926
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.57142857 0.53333333 0.75 0.66666667 0.63157895
|
|
0.75 0.875 0.66666667 0.70588235]
|
|
|
|
mean value: 0.6856438891346012
|
|
|
|
key: train_fscore
|
|
value: [0.75714286 0.77777778 0.77241379 0.74666667 0.75524476 0.78082192
|
|
0.7755102 0.77027027 0.75524476 0.77464789]
|
|
|
|
mean value: 0.7665740884664326
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.66666667 0.5 0.66666667 0.8 0.5
|
|
0.75 0.875 0.71428571 0.66666667]
|
|
|
|
mean value: 0.680595238095238
|
|
|
|
key: train_precision
|
|
value: [0.73611111 0.73684211 0.73684211 0.69135802 0.72972973 0.74025974
|
|
0.72151899 0.7125 0.72 0.74324324]
|
|
|
|
mean value: 0.726840504690327
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 0.57142857 0.85714286 0.57142857 0.85714286
|
|
0.75 0.875 0.625 0.75 ]
|
|
|
|
mean value: 0.7107142857142857
|
|
|
|
key: train_recall
|
|
value: [0.77941176 0.82352941 0.8115942 0.8115942 0.7826087 0.82608696
|
|
0.83823529 0.83823529 0.79411765 0.80882353]
|
|
|
|
mean value: 0.8114236999147485
|
|
|
|
key: test_roc_auc
|
|
value: [0.6875 0.625 0.53571429 0.74107143 0.72321429 0.55357143
|
|
0.73214286 0.86607143 0.66964286 0.66071429]
|
|
|
|
mean value: 0.6794642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.75 0.76470588 0.75873828 0.72197357 0.74424552 0.76598465
|
|
0.75969736 0.75245098 0.74488491 0.76673061]
|
|
|
|
mean value: 0.7529411764705882
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.4 0.36363636 0.6 0.5 0.46153846
|
|
0.6 0.77777778 0.5 0.54545455]
|
|
|
|
mean value: 0.5293861693861693
|
|
|
|
key: train_jcc
|
|
value: [0.6091954 0.63636364 0.62921348 0.59574468 0.60674157 0.64044944
|
|
0.63333333 0.62637363 0.60674157 0.63218391]
|
|
|
|
mean value: 0.6216340654682218
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.09977555 0.03077435 0.03092337 0.03185725 0.03266478 0.20152545
|
|
0.03012586 0.03033113 0.03158212 0.03259635]
|
|
|
|
mean value: 0.05521562099456787
|
|
|
|
key: score_time
|
|
value: [0.01020741 0.00965858 0.00987267 0.0099175 0.01043272 0.01017642
|
|
0.00950527 0.0099225 0.00961161 0.00984406]
|
|
|
|
mean value: 0.009914875030517578
|
|
|
|
key: test_mcc
|
|
value: [1. 0.75 1. 1. 0.73214286 1.
|
|
0.87287156 1. 0.87287156 0.76376262]
|
|
|
|
mean value: 0.8991648594856769
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 1. 1. 0.86666667 1.
|
|
0.93333333 1. 0.93333333 0.86666667]
|
|
|
|
mean value: 0.9475
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.875 1. 1. 0.85714286 1.
|
|
0.94117647 1. 0.94117647 0.85714286]
|
|
|
|
mean value: 0.9471638655462185
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 1. 1. 0.85714286 1.
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9509920634920634
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 1. 1. 0.85714286 1.
|
|
1. 1. 1. 0.75 ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 1. 1. 0.86607143 1.
|
|
0.92857143 1. 0.92857143 0.875 ]
|
|
|
|
mean value: 0.9473214285714285
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.77777778 1. 1. 0.75 1.
|
|
0.88888889 1. 0.88888889 0.75 ]
|
|
|
|
mean value: 0.9055555555555556
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00941396 0.01151013 0.01147294 0.01190257 0.0120573 0.01343966
|
|
0.01201916 0.01195407 0.01195812 0.01198363]
|
|
|
|
mean value: 0.011771154403686524
|
|
|
|
key: score_time
|
|
value: [0.01016879 0.00986719 0.01031709 0.01051497 0.01036811 0.0106349
|
|
0.01084495 0.01056862 0.01060319 0.01060867]
|
|
|
|
mean value: 0.010449647903442383
|
|
|
|
key: test_mcc
|
|
value: [1. 0.62994079 0.49099025 1. 0.875 0.73214286
|
|
0.87287156 1. 0.75592895 0.75592895]
|
|
|
|
mean value: 0.811280335150343
|
|
|
|
key: train_mcc
|
|
value: [0.91215932 0.95681396 0.92944673 0.88466669 0.89863497 0.94199209
|
|
0.90025835 0.9139999 0.91281179 0.87099729]
|
|
|
|
mean value: 0.9121781087453906
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.8125 0.73333333 1. 0.93333333 0.86666667
|
|
0.93333333 1. 0.86666667 0.86666667]
|
|
|
|
mean value: 0.90125
|
|
|
|
key: train_accuracy
|
|
value: [0.95588235 0.97794118 0.96350365 0.94160584 0.94890511 0.97080292
|
|
0.94890511 0.95620438 0.95620438 0.93430657]
|
|
|
|
mean value: 0.9554261485616145
|
|
|
|
key: test_fscore
|
|
value: [1. 0.82352941 0.75 1. 0.93333333 0.85714286
|
|
0.94117647 1. 0.88888889 0.88888889]
|
|
|
|
mean value: 0.908295985060691
|
|
|
|
key: train_fscore
|
|
value: [0.95652174 0.97841727 0.96503497 0.94366197 0.95035461 0.97142857
|
|
0.95035461 0.95714286 0.95652174 0.93617021]
|
|
|
|
mean value: 0.9565608542509413
|
|
|
|
key: test_precision
|
|
value: [1. 0.77777778 0.66666667 1. 0.875 0.85714286
|
|
0.88888889 1. 0.8 0.8 ]
|
|
|
|
mean value: 0.866547619047619
|
|
|
|
key: train_precision
|
|
value: [0.94285714 0.95774648 0.93243243 0.91780822 0.93055556 0.95774648
|
|
0.91780822 0.93055556 0.94285714 0.90410959]
|
|
|
|
mean value: 0.9334476814401569
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 1. 1. 0.85714286
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9589285714285715
|
|
|
|
key: train_recall
|
|
value: [0.97058824 1. 1. 0.97101449 0.97101449 0.98550725
|
|
0.98529412 0.98529412 0.97058824 0.97058824]
|
|
|
|
mean value: 0.9809889173060529
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.74107143 1. 0.9375 0.86607143
|
|
0.92857143 1. 0.85714286 0.85714286]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_roc_auc
|
|
value: [0.95588235 0.97794118 0.96323529 0.9413896 0.94874254 0.9706948
|
|
0.9491688 0.95641517 0.95630861 0.93456948]
|
|
|
|
mean value: 0.9554347826086956
|
|
|
|
key: test_jcc
|
|
value: [1. 0.7 0.6 1. 0.875 0.75
|
|
0.88888889 1. 0.8 0.8 ]
|
|
|
|
mean value: 0.8413888888888889
|
|
|
|
key: train_jcc
|
|
value: [0.91666667 0.95774648 0.93243243 0.89333333 0.90540541 0.94444444
|
|
0.90540541 0.91780822 0.91666667 0.88 ]
|
|
|
|
mean value: 0.9169909052405676
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02651024 0.0071528 0.00675678 0.00662589 0.00688267 0.00663829
|
|
0.00680137 0.00680256 0.00683355 0.00683928]
|
|
|
|
mean value: 0.008784341812133788
|
|
|
|
key: score_time
|
|
value: [0.01571369 0.00825262 0.00792074 0.0078783 0.00784111 0.00788522
|
|
0.00770473 0.00788617 0.00793123 0.00775051]
|
|
|
|
mean value: 0.008676433563232422
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 0.37796447 0.21821789 0.60714286 0.73214286 0.26189246
|
|
0.66143783 0.87287156 0.46428571 0.46428571]

mean value: 0.529018214646944

key: train_mcc
value: [0.55979287 0.57408838 0.62076318 0.57703846 0.54864511 0.60584099
0.57730871 0.51887407 0.56235346 0.56235346]

mean value: 0.5707058671664582

key: test_accuracy
value: [0.8125 0.6875 0.6 0.8 0.86666667 0.6
0.8 0.93333333 0.73333333 0.73333333]

mean value: 0.7566666666666667

key: train_accuracy
value: [0.77941176 0.78676471 0.81021898 0.78832117 0.77372263 0.80291971
0.78832117 0.75912409 0.7810219 0.7810219 ]

mean value: 0.785084800343495

key: test_fscore
value: [0.8 0.66666667 0.625 0.8 0.85714286 0.66666667
0.76923077 0.94117647 0.75 0.75 ]

mean value: 0.7625883430295195

key: train_fscore
value: [0.78571429 0.79136691 0.80882353 0.79432624 0.78321678 0.8057554
0.79136691 0.76258993 0.7826087 0.7826087 ]

mean value: 0.7888377367472581

key: test_precision
value: [0.85714286 0.71428571 0.55555556 0.75 0.85714286 0.54545455
1. 0.88888889 0.75 0.75 ]

mean value: 0.7668470418470419

key: train_precision
value: [0.76388889 0.77464789 0.82089552 0.77777778 0.75675676 0.8
0.77464789 0.74647887 0.77142857 0.77142857]

mean value: 0.775795073655595

key: test_recall
value: [0.75 0.625 0.71428571 0.85714286 0.85714286 0.85714286
0.625 1. 0.75 0.75 ]

mean value: 0.7785714285714286

key: train_recall
value: [0.80882353 0.80882353 0.79710145 0.8115942 0.8115942 0.8115942
0.80882353 0.77941176 0.79411765 0.79411765]

mean value: 0.8026001705029838

key: test_roc_auc
value: [0.8125 0.6875 0.60714286 0.80357143 0.86607143 0.61607143
0.8125 0.92857143 0.73214286 0.73214286]

mean value: 0.7598214285714285

key: train_roc_auc
value: [0.77941176 0.78676471 0.81031543 0.78815004 0.77344416 0.80285592
0.78846974 0.7592711 0.78111679 0.78111679]

mean value: 0.7850916453537937

key: test_jcc
value: [0.66666667 0.5 0.45454545 0.66666667 0.75 0.5
0.625 0.88888889 0.6 0.6 ]

mean value: 0.6251767676767677

key: train_jcc
value: [0.64705882 0.6547619 0.67901235 0.65882353 0.64367816 0.6746988
0.6547619 0.61627907 0.64285714 0.64285714]

mean value: 0.651478881972599

MCC on Blind test: 0.11

Accuracy on Blind test: 0.62

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00804782 0.00781918 0.00786948 0.00782323 0.00760174 0.00795841
0.00802374 0.00730324 0.00732517 0.00728822]

mean value: 0.0077060222625732425

key: score_time
value: [0.00777936 0.00796533 0.00840831 0.00785279 0.00842953 0.00844431
0.00777602 0.00784135 0.00779104 0.00782919]

mean value: 0.008011722564697265

key: test_mcc
value: [0.8819171 0.62994079 0.49099025 1. 0.73214286 0.60714286
0.6000992 1. 0.64465837 0.6000992 ]

mean value: 0.7186990626871869

key: train_mcc
value: [0.89949371 0.91215932 0.92791659 0.88466669 0.94199209 0.94160273
0.88938138 0.8687127 0.84688958 0.86000692]

mean value: 0.8972821710057162

key: test_accuracy
value: [0.9375 0.8125 0.73333333 1. 0.86666667 0.8
0.8 1. 0.8 0.8 ]

mean value: 0.855

key: train_accuracy
value: [0.94852941 0.95588235 0.96350365 0.94160584 0.97080292 0.97080292
0.94160584 0.93430657 0.91970803 0.9270073 ]

mean value: 0.9473754830399312

key: test_fscore
value: [0.93333333 0.82352941 0.75 1. 0.85714286 0.8
0.82352941 1. 0.84210526 0.82352941]

mean value: 0.8653169688928203

key: train_fscore
value: [0.94656489 0.95652174 0.96296296 0.94366197 0.97142857 0.97101449
0.94444444 0.93430657 0.92413793 0.93055556]

mean value: 0.9485599123980311

key: test_precision
value: [1. 0.77777778 0.66666667 1. 0.85714286 0.75
0.77777778 1. 0.72727273 0.77777778]

mean value: 0.8334415584415584

key: train_precision
value: [0.98412698 0.94285714 0.98484848 0.91780822 0.95774648 0.97101449
0.89473684 0.92753623 0.87012987 0.88157895]

mean value: 0.9332383694125169

key: test_recall
value: [0.875 0.875 0.85714286 1. 0.85714286 0.85714286
0.875 1. 1. 0.875 ]

mean value: 0.9071428571428571

key: train_recall
value: [0.91176471 0.97058824 0.94202899 0.97101449 0.98550725 0.97101449
1. 0.94117647 0.98529412 0.98529412]

mean value: 0.9663682864450128

key: test_roc_auc
value: [0.9375 0.8125 0.74107143 1. 0.86607143 0.80357143
0.79464286 1. 0.78571429 0.79464286]

mean value: 0.8535714285714285

key: train_roc_auc
value: [0.94852941 0.95588235 0.96366155 0.9413896 0.9706948 0.97080136
0.94202899 0.93435635 0.92018329 0.92742967]

mean value: 0.9474957374254049

key: test_jcc
value: [0.875 0.7 0.6 1. 0.75 0.66666667
0.7 1. 0.72727273 0.7 ]

mean value: 0.7718939393939394

key: train_jcc
value: [0.89855072 0.91666667 0.92857143 0.89333333 0.94444444 0.94366197
0.89473684 0.87671233 0.85897436 0.87012987]

mean value: 0.9025781969461155

MCC on Blind test: 0.07

Accuracy on Blind test: 0.69
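
The per-fold scores in the block above are all functions of a single confusion matrix per fold. As a rough sanity check on how they relate, the sketch below recomputes them from raw counts. The counts used (7 true positives, 0 false positives, 1 false negative, 8 true negatives) are not logged; they are one set inferred to be consistent with the first fold's values. The helper name `binary_metrics` is hypothetical, and the sketch assumes `roc_auc` was scored on hard class predictions, in which case it equals balanced accuracy.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return the scores reported in the log, from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        # Assumption: AUC computed on hard labels, i.e. balanced accuracy.
        "roc_auc": (recall + specificity) / 2,
        "jcc": tp / (tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Inferred, hypothetical counts: 16 held-out samples, 8 per class.
fold0 = binary_metrics(tp=7, fp=0, fn=1, tn=8)
for name, score in fold0.items():
    print(f"test_{name}: {score:.8f}")
```

With these counts the sketch reproduces the first entries of the `test_*` arrays above, for example an MCC of 0.8819171 and a Jaccard score of 0.875.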

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00999784 0.0094552 0.00723362 0.00716352 0.00696826 0.00690985
0.00689554 0.00771546 0.0079 0.00781608]

mean value: 0.007805538177490234

key: score_time
value: [0.01038742 0.00955176 0.00792933 0.00781178 0.00785041 0.00781822
0.00781894 0.00777292 0.00842071 0.00786495]

mean value: 0.008322644233703613

key: test_mcc
value: [0.8819171 0.62994079 0.49099025 0.875 0.76376262 0.60714286
0.46428571 0.53452248 0.46428571 0.47245559]

mean value: 0.6184303121694533

key: train_mcc
value: [0.88580789 0.81600218 0.92791659 0.9001543 0.80787444 0.80014442
0.8437116 0.64876322 0.87609014 0.86339318]

mean value: 0.836985797579123

key: test_accuracy
value: [0.9375 0.8125 0.73333333 0.93333333 0.86666667 0.8
0.73333333 0.73333333 0.73333333 0.73333333]

mean value: 0.8016666666666666

key: train_accuracy
value: [0.94117647 0.90441176 0.96350365 0.94890511 0.89781022 0.89051095
0.91970803 0.79562044 0.93430657 0.9270073 ]

mean value: 0.912296049806784

key: test_fscore
value: [0.93333333 0.82352941 0.75 0.93333333 0.875 0.8
0.75 0.8 0.75 0.77777778]

mean value: 0.819297385620915

key: train_fscore
value: [0.93846154 0.91034483 0.96296296 0.95104895 0.90666667 0.90196078
0.91472868 0.82926829 0.92913386 0.93150685]

mean value: 0.9176083413476306

key: test_precision
value: [1. 0.77777778 0.66666667 0.875 0.77777778 0.75
0.75 0.66666667 0.75 0.7 ]

mean value: 0.7713888888888889

key: train_precision
value: [0.98387097 0.85714286 0.98484848 0.91891892 0.83950617 0.82142857
0.96721311 0.70833333 1. 0.87179487]

mean value: 0.8953057292802578

key: test_recall
value: [0.875 0.875 0.85714286 1. 1. 0.85714286
0.75 1. 0.75 0.875 ]

mean value: 0.8839285714285714

key: train_recall
value: [0.89705882 0.97058824 0.94202899 0.98550725 0.98550725 1.
0.86764706 1. 0.86764706 1. ]

mean value: 0.9515984654731457

key: test_roc_auc
value: [0.9375 0.8125 0.74107143 0.9375 0.875 0.80357143
0.73214286 0.71428571 0.73214286 0.72321429]

mean value: 0.8008928571428571

key: train_roc_auc
value: [0.94117647 0.90441176 0.96366155 0.94863598 0.89716539 0.88970588
0.91933078 0.79710145 0.93382353 0.92753623]

mean value: 0.9122549019607843

key: test_jcc
value: [0.875 0.7 0.6 0.875 0.77777778 0.66666667
0.6 0.66666667 0.6 0.63636364]

mean value: 0.6997474747474748

key: train_jcc
value: [0.88405797 0.83544304 0.92857143 0.90666667 0.82926829 0.82142857
0.84285714 0.70833333 0.86764706 0.87179487]

mean value: 0.8496068375147647

MCC on Blind test: 0.06

Accuracy on Blind test: 0.66

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.07770419 0.06228852 0.0625062 0.06266785 0.06289601 0.06246185
0.06297612 0.06292748 0.06235862 0.06280899]

mean value: 0.06415958404541015

key: score_time
value: [0.01418233 0.01393175 0.01422071 0.01399136 0.01391673 0.01394653
0.01503801 0.01420355 0.01413107 0.01432395]

mean value: 0.014188599586486817

key: test_mcc
value: [0.8819171 0.75 0.875 0.875 0.73214286 0.87287156
0.87287156 1. 0.75592895 0.76376262]

mean value: 0.8379494644563421

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9375 0.875 0.93333333 0.93333333 0.86666667 0.93333333
0.93333333 1. 0.86666667 0.86666667]

mean value: 0.9145833333333333

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.93333333 0.875 0.93333333 0.93333333 0.85714286 0.92307692
0.94117647 1. 0.88888889 0.85714286]

mean value: 0.9142427996839761

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.875 0.875 0.875 0.85714286 1.
0.88888889 1. 0.8 1. ]

mean value: 0.9171031746031746

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.875 0.875 1. 1. 0.85714286 0.85714286
1. 1. 1. 0.75 ]

mean value: 0.9214285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.9375 0.875 0.9375 0.9375 0.86607143 0.92857143
0.92857143 1. 0.85714286 0.875 ]

mean value: 0.9142857142857143

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.875 0.77777778 0.875 0.875 0.75 0.85714286
0.88888889 1. 0.8 0.75 ]

mean value: 0.8448809523809524

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.08

Accuracy on Blind test: 0.75
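
The AdaBoost block reports perfect training scores, a cross-validated MCC mean of about 0.84, and a blind-test MCC of 0.08. One way to make that drop explicit when comparing models is to compute the two gaps directly; the sketch below does this with the three values copied from the block. The helper name `generalisation_gaps` and the dictionary keys are illustrative and do not come from the pipeline that produced this log.

```python
# MCC values copied from the AdaBoost block of the log.
adaboost = {
    "train_mcc": 1.0,                 # mean of train_mcc over the 10 folds
    "test_mcc": 0.8379494644563421,   # mean of test_mcc over the 10 folds
    "blind_mcc": 0.08,                # "MCC on Blind test"
}


def generalisation_gaps(scores):
    """Differences between training, cross-validated, and blind-test MCC."""
    return {
        "train_minus_cv": scores["train_mcc"] - scores["test_mcc"],
        "cv_minus_blind": scores["test_mcc"] - scores["blind_mcc"],
    }


gaps = generalisation_gaps(adaboost)
for name, gap in gaps.items():
    print(f"{name}: {gap:.3f}")
```

A large `cv_minus_blind` value relative to `train_minus_cv` suggests the cross-validation folds are not representative of the blind set, which is worth checking before ranking models on the cross-validated means alone.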

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.02706838 0.02781153 0.04628038 0.03850269 0.0461607 0.04655218
0.04729891 0.04126883 0.03603816 0.0402298 ]

mean value: 0.03972115516662598

key: score_time
value: [0.02073336 0.02294326 0.03598142 0.040658 0.03594398 0.03722
0.03625917 0.02713251 0.02583647 0.03715944]

mean value: 0.03198676109313965

key: test_mcc
value: [0.8819171 0.8819171 1. 1. 0.73214286 0.73214286
0.87287156 0.87287156 0.73214286 1. ]

mean value: 0.8706005900692904

key: train_mcc
value: [0.98540068 1. 1. 1. 1. 1.
1. 0.98550725 1. 1. ]

mean value: 0.9970907922626642

key: test_accuracy
value: [0.9375 0.9375 1. 1. 0.86666667 0.86666667
0.93333333 0.93333333 0.86666667 1. ]

mean value: 0.9341666666666667

key: train_accuracy
value: [0.99264706 1. 1. 1. 1. 1.
1. 0.99270073 1. 1. ]

mean value: 0.9985347788750537

key: test_fscore
value: [0.94117647 0.94117647 1. 1. 0.85714286 0.85714286
0.94117647 0.94117647 0.875 1. ]

mean value: 0.9353991596638656

key: train_fscore
value: [0.99259259 1. 1. 1. 1. 1.
1. 0.99270073 1. 1. ]

mean value: 0.99852933225196

key: test_precision
value: [0.88888889 0.88888889 1. 1. 0.85714286 0.85714286
0.88888889 0.88888889 0.875 1. ]

mean value: 0.914484126984127

key: train_precision
value: [1. 1. 1. 1. 1. 1.
1. 0.98550725 1. 1. ]

mean value: 0.9985507246376811

key: test_recall
value: [1. 1. 1. 1. 0.85714286 0.85714286
1. 1. 0.875 1. ]

mean value: 0.9589285714285715

key: train_recall
value: [0.98529412 1. 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9985294117647059

key: test_roc_auc
value: [0.9375 0.9375 1. 1. 0.86607143 0.86607143
0.92857143 0.92857143 0.86607143 1. ]

mean value: 0.9330357142857143

key: train_roc_auc
value: [0.99264706 1. 1. 1. 1. 1.
1. 0.99275362 1. 1. ]

mean value: 0.9985400682011936

key: test_jcc
value: [0.88888889 0.88888889 1. 1. 0.75 0.75
0.88888889 0.88888889 0.77777778 1. ]

mean value: 0.8833333333333333

key: train_jcc
value: [0.98529412 1. 1. 1. 1. 1.
1. 0.98550725 1. 1. ]

mean value: 0.997080136402387

MCC on Blind test: 0.12

Accuracy on Blind test: 0.85

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.03373861 0.03912592 0.04229856 0.04023004 0.04611397 0.04011154
0.03928065 0.04038763 0.04050183 0.04010868]

mean value: 0.04018974304199219

key: score_time
value: [0.0198133 0.01117086 0.01123762 0.02080536 0.02091765 0.01118398
0.02124166 0.02203465 0.01984 0.02217436]

mean value: 0.01804194450378418

key: test_mcc
value: [0.77459667 0.37796447 0.33928571 0.56407607 0.76376262 0.73214286
0.37796447 0.87287156 0.64465837 0.46428571]

mean value: 0.5911608523782237

key: train_mcc
value: [0.94117647 0.95598573 0.98550418 0.95630861 0.94160273 0.97080136
0.97080136 0.97080136 0.97080136 0.94201665]

mean value: 0.9605799824099576

key: test_accuracy
value: [0.875 0.6875 0.66666667 0.73333333 0.86666667 0.86666667
0.66666667 0.93333333 0.8 0.73333333]

mean value: 0.7829166666666667

key: train_accuracy
value: [0.97058824 0.97794118 0.99270073 0.97810219 0.97080292 0.98540146
0.98540146 0.98540146 0.98540146 0.97080292]

mean value: 0.9802544010304852

key: test_fscore
value: [0.88888889 0.66666667 0.66666667 0.77777778 0.875 0.85714286
0.61538462 0.94117647 0.84210526 0.75 ]

mean value: 0.7880809206273602

key: train_fscore
value: [0.97058824 0.97810219 0.99280576 0.97810219 0.97101449 0.98550725
0.98529412 0.98529412 0.98529412 0.97101449]

mean value: 0.980301695507708

key: test_precision
value: [0.8 0.71428571 0.625 0.63636364 0.77777778 0.85714286
0.8 0.88888889 0.72727273 0.75 ]

mean value: 0.7576731601731602

key: train_precision
value: [0.97058824 0.97101449 0.98571429 0.98529412 0.97101449 0.98550725
0.98529412 0.98529412 0.98529412 0.95714286]

mean value: 0.9782158080623554

key: test_recall
value: [1. 0.625 0.71428571 1. 1. 0.85714286
0.5 1. 1. 0.75 ]

mean value: 0.8446428571428571

key: train_recall
value: [0.97058824 0.98529412 1. 0.97101449 0.97101449 0.98550725
0.98529412 0.98529412 0.98529412 0.98529412]

mean value: 0.982459505541347

key: test_roc_auc
value: [0.875 0.6875 0.66964286 0.75 0.875 0.86607143
0.67857143 0.92857143 0.78571429 0.73214286]

mean value: 0.7848214285714286

key: train_roc_auc
value: [0.97058824 0.97794118 0.99264706 0.97815431 0.97080136 0.98540068
0.98540068 0.98540068 0.98540068 0.97090793]

mean value: 0.9802642796248935

key: test_jcc
value: [0.8 0.5 0.5 0.63636364 0.77777778 0.75
0.44444444 0.88888889 0.72727273 0.6 ]

mean value: 0.6624747474747474

key: train_jcc
value: [0.94285714 0.95714286 0.98571429 0.95714286 0.94366197 0.97142857
0.97101449 0.97101449 0.97101449 0.94366197]

mean value: 0.9614653136208555

MCC on Blind test: 0.05

Accuracy on Blind test: 0.62

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.09781337 0.10118818 0.09096408 0.09063625 0.08830929 0.08863807
0.1010282 0.0922606 0.0915482 0.09096527]

mean value: 0.09333515167236328

key: score_time
value: [0.00950933 0.00844288 0.00881338 0.00852418 0.00897932 0.00888801
0.00875974 0.00871754 0.00904465 0.00866079]

mean value: 0.008833980560302735

key: test_mcc
value: [0.8819171 0.8819171 1. 1. 0.73214286 0.73214286
0.87287156 0.87287156 0.73214286 1. ]

mean value: 0.8706005900692904

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9375 0.9375 1. 1. 0.86666667 0.86666667
0.93333333 0.93333333 0.86666667 1. ]

mean value: 0.9341666666666667

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.94117647 0.94117647 1. 1. 0.85714286 0.85714286
0.94117647 0.94117647 0.875 1. ]

mean value: 0.9353991596638656

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.88888889 0.88888889 1. 1. 0.85714286 0.85714286
0.88888889 0.88888889 0.875 1. ]

mean value: 0.914484126984127

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.85714286 0.85714286
1. 1. 0.875 1. ]

mean value: 0.9589285714285715

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.9375 0.9375 1. 1. 0.86607143 0.86607143
0.92857143 0.92857143 0.86607143 1. ]

mean value: 0.9330357142857143

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.88888889 0.88888889 1. 1. 0.75 0.75
0.88888889 0.88888889 0.77777778 1. ]

mean value: 0.8833333333333333

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.82
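Note on reading the per-fold scores above: the test_* keys are all functions of one confusion matrix per fold. The following is a minimal sketch, not the code that produced this log. The function name `binary_metrics` and the fold counts (TP=8, FP=1, FN=0, TN=7, i.e. 16 test samples) are assumptions, inferred only because they reproduce the first-fold values printed above (accuracy 0.9375, precision 0.88888889, recall 1.0, fscore 0.94117647, jcc 0.88888889, mcc 0.8819171, roc_auc 0.9375). The roc_auc line assumes the AUC was computed from hard class labels, in which case it equals balanced accuracy.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return the log's test_* metrics for one binary confusion matrix.

    Hypothetical helper for illustration; assumes no denominator is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # F1 score: harmonic mean of precision and recall
        "fscore": 2 * tp / (2 * tp + fp + fn),
        # Jaccard index of the positive class
        "jcc": tp / (tp + fp + fn),
        # Matthews correlation coefficient
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        # AUC from hard labels reduces to balanced accuracy (assumption)
        "roc_auc": (recall + specificity) / 2,
    }


# Assumed counts for one 16-sample test fold (inferred, not taken from the log)
m = binary_metrics(tp=8, fp=1, fn=0, tn=7)
for key, value in m.items():
    print(f"test_{key}: {value:.8f}")
```

Because MCC uses all four cells of the matrix, it can stay low while accuracy is high when one class dominates the test set, which is one possible reading of the gap between the two blind-test figures.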
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00983596 0.01095533 0.01153588 0.01127434 0.01293206 0.01323128
|
|
0.01175475 0.01181364 0.01138139 0.01201797]
|
|
|
|
mean value: 0.011673259735107421
|
|
|
|
key: score_time
|
|
value: [0.01050639 0.01042032 0.01051211 0.01094747 0.01169777 0.01332498
|
|
0.01089931 0.01096082 0.01095772 0.01388907]
|
|
|
|
mean value: 0.011411595344543456
|
|
|
|
key: test_mcc
|
|
value: [0.75 0.62994079 0.64465837 0.64465837 0.6000992 0.34247476
|
|
0.46770717 0.49099025 0.33928571 0.66143783]
|
|
|
|
mean value: 0.5571252457078674
|
|
|
|
key: train_mcc
|
|
value: [0.84051051 0.92737353 0.90259957 0.80073303 0.88938138 0.71739374
|
|
0.94318882 0.82498207 0.90246052 0.92944673]
|
|
|
|
mean value: 0.8678069912939567
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.8125 0.8 0.8 0.8 0.66666667
|
|
0.66666667 0.73333333 0.66666667 0.8 ]
|
|
|
|
mean value: 0.7620833333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.91911765 0.96323529 0.94890511 0.89051095 0.94160584 0.83941606
|
|
0.97080292 0.90510949 0.94890511 0.96350365]
|
|
|
|
mean value: 0.9291112065264062
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.82352941 0.72727273 0.72727273 0.76923077 0.54545455
|
|
0.54545455 0.71428571 0.66666667 0.76923077]
|
|
|
|
mean value: 0.7163397876633171
|
|
|
|
key: train_fscore
|
|
value: [0.91603053 0.96240602 0.94656489 0.87804878 0.93846154 0.81034483
|
|
0.96969697 0.89430894 0.94573643 0.96183206]
|
|
|
|
mean value: 0.9223430989384103
|
|
|
|
key: test_precision
|
|
value: [0.875 0.77777778 1. 1. 0.83333333 0.75
|
|
1. 0.83333333 0.71428571 1. ]
|
|
|
|
mean value: 0.8783730158730159
|
|
|
|
key: train_precision
|
|
value: [0.95238095 0.98461538 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9936996336996337
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.57142857 0.57142857 0.71428571 0.42857143
|
|
0.375 0.625 0.625 0.625 ]
|
|
|
|
mean value: 0.6285714285714286
|
|
|
|
key: train_recall
|
|
value: [0.88235294 0.94117647 0.89855072 0.7826087 0.88405797 0.68115942
|
|
0.94117647 0.80882353 0.89705882 0.92647059]
|
|
|
|
mean value: 0.8643435635123615
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.8125 0.78571429 0.78571429 0.79464286 0.65178571
|
|
0.6875 0.74107143 0.66964286 0.8125 ]
|
|
|
|
mean value: 0.7616071428571428
|
|
|
|
key: train_roc_auc
|
|
value: [0.91911765 0.96323529 0.94927536 0.89130435 0.94202899 0.84057971
|
|
0.97058824 0.90441176 0.94852941 0.96323529]
|
|
|
|
mean value: 0.9292306052855925
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.7 0.57142857 0.57142857 0.625 0.375
|
|
0.375 0.55555556 0.5 0.625 ]
|
|
|
|
mean value: 0.5676190476190476
|
|
|
|
key: train_jcc
|
|
value: [0.84507042 0.92753623 0.89855072 0.7826087 0.88405797 0.68115942
|
|
0.94117647 0.80882353 0.89705882 0.92647059]
|
|
|
|
mean value: 0.8592512877778178
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01431322 0.01028609 0.0085125 0.00834036 0.00857472 0.00834465
|
|
0.00830102 0.00752926 0.0077374 0.00805783]
|
|
|
|
mean value: 0.00899970531463623
|
|
|
|
key: score_time
|
|
value: [0.01112556 0.00929952 0.00890088 0.00855279 0.0085485 0.00859904
|
|
0.00831628 0.00797892 0.00823665 0.00807309]
|
|
|
|
mean value: 0.00876312255859375
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.62994079 0.66143783 1. 0.875 0.73214286
|
|
0.6000992 1. 0.75592895 0.6000992 ]
|
|
|
|
mean value: 0.7736565919262326
|
|
|
|
key: train_mcc
|
|
value: [0.86849267 0.89715584 0.89791134 0.88355744 0.88355744 0.89863497
|
|
0.85440207 0.85440207 0.89791134 0.86948194]
|
|
|
|
mean value: 0.8805507116446566
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.8125 0.8 1. 0.93333333 0.86666667
|
|
0.8 1. 0.86666667 0.8 ]
|
|
|
|
mean value: 0.8816666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.93382353 0.94852941 0.94890511 0.94160584 0.94160584 0.94890511
|
|
0.9270073 0.9270073 0.94890511 0.93430657]
|
|
|
|
mean value: 0.9400601116358952
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.82352941 0.82352941 1. 0.93333333 0.85714286
|
|
0.82352941 1. 0.88888889 0.82352941]
|
|
|
|
mean value: 0.8906816059757237
|
|
|
|
key: train_fscore
|
|
value: [0.9352518 0.94890511 0.94890511 0.94285714 0.94285714 0.95035461
|
|
0.92753623 0.92753623 0.94890511 0.9352518 ]
|
|
|
|
mean value: 0.9408360285000935
|
|
|
|
key: test_precision
|
|
value: [1. 0.77777778 0.7 1. 0.875 0.85714286
|
|
0.77777778 1. 0.8 0.77777778]
|
|
|
|
mean value: 0.856547619047619
|
|
|
|
key: train_precision
|
|
value: [0.91549296 0.94202899 0.95588235 0.92957746 0.92957746 0.93055556
|
|
0.91428571 0.91428571 0.94202899 0.91549296]
|
|
|
|
mean value: 0.9289208153153076
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 1. 1. 1. 0.85714286
|
|
0.875 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_recall
|
|
value: [0.95588235 0.95588235 0.94202899 0.95652174 0.95652174 0.97101449
|
|
0.94117647 0.94117647 0.95588235 0.95588235]
|
|
|
|
mean value: 0.9531969309462915
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.8125 1. 0.9375 0.86607143
|
|
0.79464286 1. 0.85714286 0.79464286]
|
|
|
|
mean value: 0.88125
|
|
|
|
key: train_roc_auc
|
|
value: [0.93382353 0.94852941 0.94895567 0.94149616 0.94149616 0.94874254
|
|
0.92710997 0.92710997 0.94895567 0.93446292]
|
|
|
|
mean value: 0.940068201193521
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.7 0.7 1. 0.875 0.75 0.7 1. 0.8 0.7 ]
|
|
|
|
mean value: 0.8099999999999999
|
|
|
|
key: train_jcc
|
|
value: [0.87837838 0.90277778 0.90277778 0.89189189 0.89189189 0.90540541
|
|
0.86486486 0.86486486 0.90277778 0.87837838]
|
|
|
|
mean value: 0.888400900900901
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: /home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:143: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:146: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.07313299 0.06227994 0.06231952 0.06033921 0.06083584 0.06107545
|
|
0.06096387 0.06110859 0.06186771 0.06140947]
|
|
|
|
mean value: 0.06253325939178467
|
|
|
|
key: score_time
|
|
value: [0.00833368 0.00824118 0.00828338 0.00820613 0.00824714 0.00827527
|
|
0.00827289 0.00825977 0.00888276 0.00831437]
|
|
|
|
mean value: 0.008331656455993652
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.62994079 0.66143783 1. 0.875 0.73214286
|
|
0.75592895 1. 0.75592895 0.6000992 ]
|
|
|
|
mean value: 0.7892395667131802
|
|
|
|
key: train_mcc
|
|
value: [0.86849267 0.89715584 0.8978896 0.89863497 0.88355744 0.92709446
|
|
0.89869927 0.85440207 0.92710997 0.87099729]
|
|
|
|
mean value: 0.8924033569902855
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.8125 0.8 1. 0.93333333 0.86666667
|
|
0.86666667 1. 0.86666667 0.8 ]
|
|
|
|
mean value: 0.8883333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.93382353 0.94852941 0.94890511 0.94890511 0.94160584 0.96350365
|
|
0.94890511 0.9270073 0.96350365 0.93430657]
|
|
|
|
mean value: 0.9458995276942894
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.82352941 0.82352941 1. 0.93333333 0.85714286
|
|
0.88888889 1. 0.88888889 0.82352941]
|
|
|
|
mean value: 0.8972175536881419
|
|
|
|
key: train_fscore
|
|
value: [0.9352518 0.94890511 0.94964029 0.95035461 0.94285714 0.96402878
|
|
0.94964029 0.92753623 0.96350365 0.93617021]
|
|
|
|
mean value: 0.946788810763946
|
|
|
|
key: test_precision
|
|
value: [1. 0.77777778 0.7 1. 0.875 0.85714286
|
|
0.8 1. 0.8 0.77777778]
|
|
|
|
mean value: 0.8587698412698412
|
|
|
|
key: train_precision
|
|
value: [0.91549296 0.94202899 0.94285714 0.93055556 0.92957746 0.95714286
|
|
0.92957746 0.91428571 0.95652174 0.90410959]
|
|
|
|
mean value: 0.932214947084399
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_recall
|
|
value: [0.95588235 0.95588235 0.95652174 0.97101449 0.95652174 0.97101449
|
|
0.97058824 0.94117647 0.97058824 0.97058824]
|
|
|
|
mean value: 0.9619778346121057
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.8125 1. 0.9375 0.86607143
|
|
0.85714286 1. 0.85714286 0.79464286]
|
|
|
|
mean value: 0.8875000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [0.93382353 0.94852941 0.9488491 0.94874254 0.94149616 0.96344842
|
|
0.94906223 0.92710997 0.96355499 0.93456948]
|
|
|
|
mean value: 0.9459185848252345
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.7 0.7 1. 0.875 0.75 0.8 1. 0.8 0.7 ]
|
|
|
|
mean value: 0.82
|
|
|
|
key: train_jcc
|
|
value: [0.87837838 0.90277778 0.90410959 0.90540541 0.89189189 0.93055556
|
|
0.90410959 0.86486486 0.92957746 0.88 ]
|
|
|
|
mean value: 0.8991670516744799
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01616359 0.01377511 0.01260805 0.01199722 0.01317811 0.01211739
|
|
0.01303506 0.01296759 0.01235151 0.01292706]
|
|
|
|
mean value: 0.013112068176269531
|
|
|
|
key: score_time
|
|
value: [0.01072264 0.00871634 0.00817013 0.00809884 0.00805712 0.0079968
|
|
0.00806427 0.00803137 0.00803447 0.00811362]
|
|
|
|
mean value: 0.008400559425354004
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.5 0.37796447 0.73214286 0.87287156 0.60714286
|
|
0.60714286 0.60714286 0.64465837 0.6000992 ]
|
|
|
|
mean value: 0.6431082135582106
|
|
|
|
key: train_mcc
|
|
value: [0.77949606 0.80961181 0.82629176 0.78182997 0.81031543 0.82480818
|
|
0.75186529 0.81092683 0.82614456 0.79560955]
|
|
|
|
mean value: 0.8016899442942331
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.75 0.66666667 0.86666667 0.93333333 0.8
|
|
0.8 0.8 0.8 0.8 ]
|
|
|
|
mean value: 0.8154166666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.88970588 0.90441176 0.91240876 0.89051095 0.90510949 0.91240876
|
|
0.87591241 0.90510949 0.91240876 0.89781022]
|
|
|
|
mean value: 0.9005796479175612
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.75 0.70588235 0.85714286 0.92307692 0.8
|
|
0.8 0.8 0.84210526 0.82352941]
|
|
|
|
mean value: 0.823507014141689
|
|
|
|
key: train_fscore
|
|
value: [0.88888889 0.90225564 0.91044776 0.88888889 0.90510949 0.91304348
|
|
0.87407407 0.90225564 0.90909091 0.89705882]
|
|
|
|
mean value: 0.8991113591173656
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.6 0.85714286 1. 0.75
|
|
0.85714286 0.85714286 0.72727273 0.77777778]
|
|
|
|
mean value: 0.8176479076479076
|
|
|
|
key: train_precision
|
|
value: [0.89552239 0.92307692 0.93846154 0.90909091 0.91176471 0.91304348
|
|
0.88059701 0.92307692 0.9375 0.89705882]
|
|
|
|
mean value: 0.9129192704364003
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.88235294 0.88235294 0.88405797 0.86956522 0.89855072 0.91304348
|
|
0.86764706 0.88235294 0.88235294 0.89705882]
|
|
|
|
mean value: 0.8859335038363171
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.75 0.67857143 0.86607143 0.92857143 0.80357143
|
|
0.80357143 0.80357143 0.78571429 0.79464286]
|
|
|
|
mean value: 0.8151785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.88970588 0.90441176 0.91261722 0.89066496 0.90515772 0.91240409
|
|
0.87585251 0.90494459 0.91219096 0.89780477]
|
|
|
|
mean value: 0.9005754475703325
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.6 0.54545455 0.75 0.85714286 0.66666667
|
|
0.66666667 0.66666667 0.72727273 0.7 ]
|
|
|
|
mean value: 0.705487012987013
|
|
|
|
key: train_jcc
|
|
value: [0.8 0.82191781 0.83561644 0.8 0.82666667 0.84
|
|
0.77631579 0.82191781 0.83333333 0.81333333]
|
|
|
|
mean value: 0.8169101177601538
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.37280297 0.37843227 0.38014102 0.37847543 0.37922144 0.38759017
|
|
0.37933087 0.39223146 0.38670659 0.38647294]
|
|
|
|
mean value: 0.38214051723480225
|
|
|
|
key: score_time
|
|
value: [0.0084753 0.00828695 0.00884271 0.00918055 0.00932026 0.00898337
|
|
0.00927162 0.00885415 0.00936317 0.00934863]
|
|
|
|
mean value: 0.008992671966552734
|
|
|
|
key: test_mcc
|
|
value: [1. 0.77459667 0.66143783 0.76376262 0.73214286 0.60714286
|
|
0.75592895 0.87287156 0.75592895 0.6000992 ]
|
|
|
|
mean value: 0.7523911478249176
|
|
|
|
key: train_mcc
|
|
value: [0.94158382 1. 0.95629932 0.94199209 0.95629932 0.98550418
|
|
0.95713391 1. 1. 1. ]
|
|
|
|
mean value: 0.9738812635764046
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 0.8 0.86666667 0.86666667 0.8
|
|
0.86666667 0.93333333 0.86666667 0.8 ]
|
|
|
|
mean value: 0.8675
|
|
|
|
key: train_accuracy
|
|
value: [0.97058824 1. 0.97810219 0.97080292 0.97810219 0.99270073
|
|
0.97810219 1. 1. 1. ]
|
|
|
|
mean value: 0.986839845427222
|
|
|
|
key: test_fscore
|
|
value: [1. 0.88888889 0.82352941 0.875 0.85714286 0.8
|
|
0.88888889 0.94117647 0.88888889 0.82352941]
|
|
|
|
mean value: 0.8787044817927171
|
|
|
|
key: train_fscore
|
|
value: [0.97101449 1. 0.97841727 0.97142857 0.97841727 0.99280576
|
|
0.97841727 1. 1. 1. ]
|
|
|
|
mean value: 0.9870500618139029
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.7 0.77777778 0.85714286 0.75
|
|
0.8 0.88888889 0.8 0.77777778]
|
|
|
|
mean value: 0.8151587301587302
|
|
|
|
key: train_precision
|
|
value: [0.95714286 1. 0.97142857 0.95774648 0.97142857 0.98571429
|
|
0.95774648 1. 1. 1. ]
|
|
|
|
mean value: 0.9801207243460764
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9589285714285715
|
|
|
|
key: train_recall
|
|
value: [0.98529412 1. 0.98550725 0.98550725 0.98550725 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9941815856777494
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 0.8125 0.875 0.86607143 0.80357143
|
|
0.85714286 0.92857143 0.85714286 0.79464286]
|
|
|
|
mean value: 0.8669642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.97058824 1. 0.97804774 0.9706948 0.97804774 0.99264706
|
|
0.97826087 1. 1. 1. ]
|
|
|
|
mean value: 0.9868286445012788
|
|
|
|
key: test_jcc
|
|
value: [1. 0.8 0.7 0.77777778 0.75 0.66666667
|
|
0.8 0.88888889 0.8 0.7 ]
|
|
|
|
mean value: 0.7883333333333333
|
|
|
|
key: train_jcc
|
|
value: [0.94366197 1. 0.95774648 0.94444444 0.95774648 0.98571429
|
|
0.95774648 1. 1. 1. ]
|
|
|
|
mean value: 0.9747060138609435
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00959182 0.00908065 0.00727654 0.0070405 0.0074923 0.00699615
|
|
0.00736642 0.00702 0.00749421 0.0074172 ]
|
|
|
|
mean value: 0.0076775789260864254
|
|
|
|
key: score_time
|
|
value: [0.01065612 0.01025677 0.00826311 0.0082767 0.00856185 0.00839043
|
|
0.00823283 0.00838685 0.00851226 0.00863767]
|
|
|
|
mean value: 0.008817458152770996
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.37796447 0.49099025 0.37796447 0.21821789 0.49099025
|
|
0.18898224 0.46428571 0.64465837 0.20044593]
|
|
|
|
mean value: 0.42290962650028463
|
|
|
|
key: train_mcc
|
|
value: [0.57208135 0.54899485 0.52400868 0.47754676 0.56162481 0.60455208
|
|
0.60096088 0.6802431 0.57604541 0.66161034]
|
|
|
|
mean value: 0.5807668254236807
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.6875 0.73333333 0.66666667 0.6 0.73333333
|
|
0.6 0.73333333 0.8 0.6 ]
|
|
|
|
mean value: 0.7029166666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.77205882 0.76470588 0.74452555 0.72992701 0.76642336 0.79562044
|
|
0.78832117 0.83211679 0.77372263 0.81751825]
|
|
|
|
mean value: 0.7784939888364105
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.66666667 0.75 0.70588235 0.625 0.75
|
|
0.66666667 0.75 0.84210526 0.7 ]
|
|
|
|
mean value: 0.7345209838321294
|
|
|
|
key: train_fscore
|
|
value: [0.80254777 0.79220779 0.78527607 0.76433121 0.8 0.77419355
|
|
0.81290323 0.80991736 0.80254777 0.83870968]
|
|
|
|
mean value: 0.7982634424404584
|
|
|
|
key: test_precision
|
|
value: [0.8 0.71428571 0.66666667 0.6 0.55555556 0.66666667
|
|
0.6 0.75 0.72727273 0.58333333]
|
|
|
|
mean value: 0.6663780663780664
|
|
|
|
key: train_precision
|
|
value: [0.70786517 0.70930233 0.68085106 0.68181818 0.7032967 0.87272727
|
|
0.72413793 0.9245283 0.70786517 0.74712644]
|
|
|
|
mean value: 0.7459518554034876
|
|
|
|
key: test_recall
|
|
value: [1. 0.625 0.85714286 0.85714286 0.71428571 0.85714286
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.89705882 0.92753623 0.86956522 0.92753623 0.69565217
|
|
0.92647059 0.72058824 0.92647059 0.95588235]
|
|
|
|
mean value: 0.8773231031543052
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.6875 0.74107143 0.67857143 0.60714286 0.74107143
|
|
0.58928571 0.73214286 0.78571429 0.58035714]
|
|
|
|
mean value: 0.7017857142857142
|
|
|
|
key: train_roc_auc
|
|
value: [0.77205882 0.76470588 0.74317988 0.72890026 0.7652387 0.7963555
|
|
0.78932225 0.83130861 0.7748295 0.81852089]
|
|
|
|
mean value: 0.7784420289855073
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.5 0.6 0.54545455 0.45454545 0.6
|
|
0.5 0.6 0.72727273 0.53846154]
|
|
|
|
mean value: 0.5865734265734266
|
|
|
|
key: train_jcc
|
|
value: [0.67021277 0.65591398 0.64646465 0.6185567 0.66666667 0.63157895
|
|
0.68478261 0.68055556 0.67021277 0.72222222]
|
|
|
|
mean value: 0.6647166858413609
|
|
|
|
MCC on Blind test: 0.02
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00780797 0.00748897 0.00779843 0.00788808 0.007725 0.00771189
|
|
0.00735712 0.00735378 0.0075953 0.00707674]
|
|
|
|
mean value: 0.007580327987670899
|
|
|
|
key: score_time
|
|
value: [0.00868344 0.00823379 0.00862813 0.00852036 0.00870013 0.00808978
|
|
0.00818658 0.0080483 0.00845337 0.0080812 ]
|
|
|
|
mean value: 0.008362507820129395
|
|
|
|
key: test_mcc
|
|
value: [0.25 0.25819889 0.07142857 0.33928571 0.46428571 0.13363062
|
|
0.33928571 0.46428571 0.33928571 0.49099025]
|
|
|
|
mean value: 0.3150676906591499
|
|
|
|
key: train_mcc
|
|
value: [0.48788604 0.49441323 0.48933032 0.47900717 0.52059257 0.46076782
|
|
0.4312221 0.41698711 0.44522592 0.43208129]
|
|
|
|
mean value: 0.46575135687893415
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.625 0.53333333 0.66666667 0.73333333 0.53333333
|
|
0.66666667 0.73333333 0.66666667 0.73333333]
|
|
|
|
mean value: 0.6516666666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.74264706 0.74264706 0.74452555 0.73722628 0.75912409 0.72992701
|
|
0.71532847 0.7080292 0.72262774 0.71532847]
|
|
|
|
mean value: 0.7317410905968227
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.57142857 0.53333333 0.66666667 0.71428571 0.63157895
|
|
0.66666667 0.75 0.66666667 0.71428571]
|
|
|
|
mean value: 0.6539912280701754
|
|
|
|
key: train_fscore
|
|
value: [0.75524476 0.76510067 0.75177305 0.75675676 0.77241379 0.74125874
|
|
0.71942446 0.71428571 0.72058824 0.72340426]
|
|
|
|
mean value: 0.7420250432480666
|
|
|
|
key: test_precision
|
|
value: [0.625 0.66666667 0.5 0.625 0.71428571 0.5
|
|
0.71428571 0.75 0.71428571 0.83333333]
|
|
|
|
mean value: 0.6642857142857143
|
|
|
|
key: train_precision
|
|
value: [0.72 0.7037037 0.73611111 0.70886076 0.73684211 0.71621622
|
|
0.70422535 0.69444444 0.72058824 0.69863014]
|
|
|
|
mean value: 0.7139622064625399
|
|
|
|
key: test_recall
|
|
value: [0.625 0.5 0.57142857 0.71428571 0.71428571 0.85714286
|
|
0.625 0.75 0.625 0.625 ]
|
|
|
|
mean value: 0.6607142857142857
|
|
|
|
key: train_recall
|
|
value: [0.79411765 0.83823529 0.76811594 0.8115942 0.8115942 0.76811594
|
|
0.73529412 0.73529412 0.72058824 0.75 ]
|
|
|
|
mean value: 0.7732949701619778
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.625 0.53571429 0.66964286 0.73214286 0.55357143
|
|
0.66964286 0.73214286 0.66964286 0.74107143]
|
|
|
|
mean value: 0.6553571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.74264706 0.74264706 0.74435209 0.73667945 0.75873828 0.72964621
|
|
0.71547315 0.70822677 0.72261296 0.71557971]
|
|
|
|
mean value: 0.7316602728047741
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.4 0.36363636 0.5 0.55555556 0.46153846
|
|
0.5 0.6 0.5 0.55555556]
|
|
|
|
mean value: 0.4890831390831391
|
|
|
|
key: train_jcc
|
|
value: [0.60674157 0.61956522 0.60227273 0.60869565 0.62921348 0.58888889
|
|
0.56179775 0.55555556 0.56321839 0.56666667]
|
|
|
|
mean value: 0.5902615907742418
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00691152 0.00645924 0.00683975 0.00714922 0.00726128 0.00734305
|
|
0.00735426 0.00752354 0.00730586 0.00730991]
|
|
|
|
mean value: 0.0071457624435424805
|
|
|
|
key: score_time
|
|
value: [0.00938153 0.00884461 0.00903273 0.0093987 0.00931835 0.00955343
|
|
0.00963545 0.0096395 0.00957155 0.00971961]
|
|
|
|
mean value: 0.009409546852111816
|
|
|
|
key: test_mcc
|
|
value: [ 0.62994079 0.5 0.49099025 0.6000992 0.49099025 0.32732684
|
|
-0.02620712 0.46428571 0.32732684 0.32732684]
|
|
|
|
mean value: 0.4132079591989289
|
|
|
|
key: train_mcc
|
|
value: [0.69731096 0.6918501 0.75815907 0.66971076 0.69510727 0.70910029
|
|
0.6523446 0.71313464 0.68163703 0.66616982]
|
|
|
|
mean value: 0.6934524542628495
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.75 0.73333333 0.8 0.73333333 0.66666667
|
|
0.46666667 0.73333333 0.66666667 0.66666667]
|
|
|
|
mean value: 0.7029166666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.84558824 0.84558824 0.87591241 0.83211679 0.84671533 0.8540146
|
|
0.82481752 0.8540146 0.83941606 0.83211679]
|
|
|
|
mean value: 0.8450300558179475
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.75 0.75 0.76923077 0.75 0.61538462
|
|
0.2 0.75 0.70588235 0.70588235]
|
|
|
|
mean value: 0.6819909502262443
|
|
|
|
key: train_fscore
|
|
value: [0.85517241 0.84892086 0.88435374 0.84353741 0.85314685 0.85915493
|
|
0.83098592 0.86111111 0.84507042 0.83687943]
|
|
|
|
mean value: 0.8518333098052753
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.75 0.66666667 0.83333333 0.66666667 0.66666667
|
|
0.5 0.75 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6944444444444444
|
|
|
|
key: train_precision
|
|
value: [0.80519481 0.83098592 0.83333333 0.79487179 0.82432432 0.83561644
|
|
0.7972973 0.81578947 0.81081081 0.80821918]
|
|
|
|
mean value: 0.815644337144789
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 0.71428571 0.85714286 0.57142857
|
|
0.125 0.75 0.75 0.75 ]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [0.91176471 0.86764706 0.94202899 0.89855072 0.88405797 0.88405797
|
|
0.86764706 0.91176471 0.88235294 0.86764706]
|
|
|
|
mean value: 0.8917519181585678
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.75 0.74107143 0.79464286 0.74107143 0.66071429
|
|
0.49107143 0.73214286 0.66071429 0.66071429]
|
|
|
|
mean value: 0.7044642857142858
|
|
|
|
key: train_roc_auc
|
|
value: [0.84558824 0.84558824 0.87542626 0.8316283 0.84644075 0.85379369
|
|
0.82512788 0.85443308 0.8397272 0.83237425]
|
|
|
|
mean value: 0.8450127877237852
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.6 0.6 0.625 0.6 0.44444444
|
|
0.11111111 0.6 0.54545455 0.54545455]
|
|
|
|
mean value: 0.5371464646464646
|
|
|
|
key: train_jcc
|
|
value: [0.74698795 0.7375 0.79268293 0.72941176 0.74390244 0.75308642
|
|
0.71084337 0.75609756 0.73170732 0.7195122 ]
|
|
|
|
mean value: 0.7421731948784563
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00869989 0.00870991 0.00899339 0.00884175 0.00880337 0.00869131
|
|
0.00819468 0.00880289 0.00880551 0.00904202]
|
|
|
|
mean value: 0.008758473396301269
|
|
|
|
key: score_time
|
|
value: [0.00901675 0.00891495 0.00872993 0.00879502 0.00886464 0.00874734
|
|
0.00867009 0.00894928 0.00889111 0.00890279]
|
|
|
|
mean value: 0.008848190307617188
|
|
|
|
key: test_mcc
|
|
value: [0.75 0.62994079 0.49099025 0.76376262 0.60714286 0.60714286
|
|
0.76376262 0.60714286 0.75592895 0.73214286]
|
|
|
|
mean value: 0.6707956647621525
|
|
|
|
key: train_mcc
|
|
value: [0.85294118 0.88235294 0.86868474 0.81027501 0.8687127 0.85434012
|
|
0.89869927 0.89869927 0.89863497 0.85440207]
|
|
|
|
mean value: 0.8687742253690149
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.8125 0.73333333 0.86666667 0.8 0.8
|
|
0.86666667 0.8 0.86666667 0.86666667]
|
|
|
|
mean value: 0.82875
|
|
|
|
key: train_accuracy
|
|
value: [0.92647059 0.94117647 0.93430657 0.90510949 0.93430657 0.9270073
|
|
0.94890511 0.94890511 0.94890511 0.9270073 ]
|
|
|
|
mean value: 0.9342099613568055
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.8 0.75 0.875 0.8 0.8
|
|
0.85714286 0.8 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8321031746031746
|
|
|
|
key: train_fscore
|
|
value: [0.92647059 0.94117647 0.9352518 0.90647482 0.93430657 0.92857143
|
|
0.94964029 0.94964029 0.94736842 0.92753623]
|
|
|
|
mean value: 0.9346436903919317
|
|
|
|
key: test_precision
|
|
value: [0.875 0.85714286 0.66666667 0.77777778 0.75 0.75
|
|
1. 0.85714286 0.8 0.875 ]
|
|
|
|
mean value: 0.8208730158730159
|
|
|
|
key: train_precision
|
|
value: [0.92647059 0.94117647 0.92857143 0.9 0.94117647 0.91549296
|
|
0.92957746 0.92957746 0.96923077 0.91428571]
|
|
|
|
mean value: 0.929555932882362
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 1. 0.85714286 0.85714286
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.94117647 0.94202899 0.91304348 0.92753623 0.94202899
|
|
0.97058824 0.97058824 0.92647059 0.94117647]
|
|
|
|
mean value: 0.9401108269394715
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.8125 0.74107143 0.875 0.80357143 0.80357143
|
|
0.875 0.80357143 0.85714286 0.86607143]
|
|
|
|
mean value: 0.83125
|
|
|
|
key: train_roc_auc
|
|
value: [0.92647059 0.94117647 0.93424979 0.90505115 0.93435635 0.92689685
|
|
0.94906223 0.94906223 0.94874254 0.92710997]
|
|
|
|
mean value: 0.9342178175618073
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.66666667 0.6 0.77777778 0.66666667 0.66666667
|
|
0.75 0.66666667 0.8 0.77777778]
|
|
|
|
mean value: 0.715
|
|
|
|
key: train_jcc
|
|
value: [0.8630137 0.88888889 0.87837838 0.82894737 0.87671233 0.86666667
|
|
0.90410959 0.90410959 0.9 0.86486486]
|
|
|
|
mean value: 0.8775691372699304
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.47774863 0.47219229 0.59577179 0.48569918 0.47681928 0.47980237
|
|
0.52681231 0.59629416 0.46796393 0.48238492]
|
|
|
|
mean value: 0.506148886680603
|
|
|
|
key: score_time
|
|
value: [0.01098704 0.01345611 0.01107907 0.01333833 0.01326942 0.01334047
|
|
0.01134348 0.01388001 0.01111221 0.01353312]
|
|
|
|
mean value: 0.012533926963806152
|
|
|
|
key: test_mcc
|
|
value: [1. 0.77459667 0.37796447 0.60714286 0.76376262 0.60714286
|
|
0.46428571 0.60714286 0.75592895 0.73214286]
|
|
|
|
mean value: 0.6690109846952281
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 0.66666667 0.8 0.86666667 0.8
|
|
0.73333333 0.8 0.86666667 0.86666667]
|
|
|
|
mean value: 0.8275
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.88888889 0.70588235 0.8 0.875 0.8
|
|
0.75 0.8 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8383660130718954
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.6 0.75 0.77777778 0.75
|
|
0.75 0.85714286 0.8 0.875 ]
|
|
|
|
mean value: 0.7959920634920635
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.85714286 0.85714286 1. 0.85714286
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8946428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 0.67857143 0.80357143 0.875 0.80357143
|
|
0.73214286 0.80357143 0.85714286 0.86607143]
|
|
|
|
mean value: 0.8294642857142858
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.8 0.54545455 0.66666667 0.77777778 0.66666667
|
|
0.6 0.66666667 0.8 0.77777778]
|
|
|
|
mean value: 0.7301010101010101
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.69
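The MLP block above reports a blind-test accuracy of 0.69 alongside an MCC of only 0.06. One common way for this pattern to arise is class imbalance in the blind set. The log does not show the class counts, so that is an assumption. The sketch below uses made-up confusion-matrix counts (not taken from this run) to show how accuracy can stay high while MCC stays near zero.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


# Hypothetical, imbalanced blind set: 10 positives, 190 negatives.
tp, fn, fp, tn = 3, 7, 20, 170

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

With these illustrative counts, most of the accuracy comes from the majority class, while MCC penalises the missed positives and the false alarms.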
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01036239 0.00926948 0.00805068 0.00823069 0.00814795 0.00783706
|
|
0.00812101 0.00809956 0.00801969 0.00834656]
|
|
|
|
mean value: 0.008448505401611328
|
|
|
|
key: score_time
|
|
value: [0.01827431 0.00897479 0.00929928 0.00881314 0.00856185 0.00861549
|
|
0.00854754 0.00869775 0.00881147 0.00856495]
|
|
|
|
mean value: 0.009716057777404785
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 1. 1. 0.875 0.87287156
|
|
0.87287156 0.75592895 0.87287156 0.875 ]
|
|
|
|
mean value: 0.9006460732538559
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 1. 1. 0.93333333 0.93333333
|
|
0.93333333 0.86666667 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9470833333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 1. 1. 0.93333333 0.92307692
|
|
0.94117647 0.88888889 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9502161890397185
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 1. 1. 0.875 1.
|
|
0.88888889 0.8 0.88888889 1. ]
|
|
|
|
mean value: 0.9341666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 1. 1. 0.9375 0.92857143
|
|
0.92857143 0.85714286 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9455357142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 1. 1. 0.875 0.85714286
|
|
0.88888889 0.8 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9073809523809524
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.85
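Each model block in this log follows the same `key:` / `value:` / `mean value:` layout, ending with the two blind-test lines. A minimal sketch of a parser that pulls the mean scores out of one block, so models can be compared without reading the full output. The function name `parse_mean_scores` and the `blind_` key prefix are hypothetical. The sample text is copied from the Decision Tree block above.

```python
import re


def parse_mean_scores(log_text):
    """Map each metric key to its 'mean value', plus the blind-test scores."""
    scores = {}
    current_key = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("key:"):
            current_key = line.split(":", 1)[1].strip()
        elif line.startswith("mean value:") and current_key is not None:
            scores[current_key] = float(line.split(":", 1)[1])
            current_key = None
        else:
            match = re.match(r"(MCC|Accuracy) on Blind test:\s*([-\d.]+)", line)
            if match:
                scores[f"blind_{match.group(1).lower()}"] = float(match.group(2))
    return scores


sample = """
key: test_mcc
|
value: [1. 0.8819171 1. 1. 0.875 0.87287156
|
mean value: 0.9006460732538559
|
key: train_mcc
|
mean value: 1.0
|
MCC on Blind test: 0.13
|
Accuracy on Blind test: 0.85
"""

scores = parse_mean_scores(sample)
for name, value in scores.items():
    print(f"{name}: {value}")
```

Separator lines and the per-fold `value:` arrays are skipped, so the result holds one number per metric.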
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08768535 0.08900571 0.08186841 0.087538 0.08398795 0.08419561
|
|
0.083009 0.08226275 0.08182836 0.0807085 ]
|
|
|
|
mean value: 0.08420896530151367
|
|
|
|
key: score_time
|
|
value: [0.01827073 0.01797938 0.01798153 0.01794004 0.01733184 0.01720476
|
|
0.01783872 0.01791549 0.017555 0.01719594]
|
|
|
|
mean value: 0.017721343040466308
|
|
|
|
key: test_mcc
|
|
value: [1. 0.75 0.73214286 1. 0.875 0.73214286
|
|
0.60714286 0.76376262 0.87287156 0.76376262]
|
|
|
|
mean value: 0.8096825364024487
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 0.86666667 1. 0.93333333 0.86666667
|
|
0.8 0.86666667 0.93333333 0.86666667]
|
|
|
|
mean value: 0.9008333333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.875 0.85714286 1. 0.93333333 0.85714286
|
|
0.8 0.85714286 0.94117647 0.85714286]
|
|
|
|
mean value: 0.8978081232492997
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.85714286 1. 0.875 0.85714286
|
|
0.85714286 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.921031746031746
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 1. 1. 0.85714286
|
|
0.75 0.75 1. 0.75 ]
|
|
|
|
mean value: 0.8839285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 0.86607143 1. 0.9375 0.86607143
|
|
0.80357143 0.875 0.92857143 0.875 ]
|
|
|
|
mean value: 0.9026785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.77777778 0.75 1. 0.875 0.75
|
|
0.66666667 0.75 0.88888889 0.75 ]
|
|
|
|
mean value: 0.8208333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00700188 0.00755453 0.00768661 0.00771928 0.00732183 0.00702024
|
|
0.00708175 0.00722766 0.00722528 0.00710702]
|
|
|
|
mean value: 0.007294607162475586
|
|
|
|
key: score_time
|
|
value: [0.00804567 0.00840735 0.0089283 0.0083468 0.0084908 0.00807238
|
|
0.00812387 0.0082643 0.00818753 0.00800323]
|
|
|
|
mean value: 0.00828702449798584
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.8819171 0.73214286 1. 0.76376262 0.46428571
|
|
0.49099025 0.60714286 0.875 0.87287156]
|
|
|
|
mean value: 0.7570030065748747
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.9375 0.86666667 1. 0.86666667 0.73333333
|
|
0.73333333 0.8 0.93333333 0.93333333]
|
|
|
|
mean value: 0.8741666666666666
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.93333333 0.85714286 1. 0.875 0.71428571
|
|
0.71428571 0.8 0.93333333 0.94117647]
|
|
|
|
mean value: 0.8709733893557423
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.85714286 1. 0.77777778 0.71428571
|
|
0.83333333 0.85714286 1. 0.88888889]
|
|
|
|
mean value: 0.8817460317460317
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 1. 1. 0.71428571
|
|
0.625 0.75 0.875 1. ]
|
|
|
|
mean value: 0.8696428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.9375 0.86607143 1. 0.875 0.73214286
|
|
0.74107143 0.80357143 0.9375 0.92857143]
|
|
|
|
mean value: 0.8758928571428571
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.875 0.75 1. 0.77777778 0.55555556
|
|
0.55555556 0.66666667 0.875 0.88888889]
|
|
|
|
mean value: 0.7833333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.73
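The `test_fscore` and `test_jcc` rows carry the same information if, as the values suggest, `fscore` is the binary F1 score and `jcc` is the Jaccard index. That reading is an assumption, since the log does not name the scorers. Under it, with J = TP / (TP + FP + FN) and F = 2TP / (2TP + FP + FN), the two are related by J = F / (2 - F). The sketch below checks that relation against the per-fold values printed in the Extra Tree block above.

```python
import math


def jaccard_from_f1(f1):
    """Convert a binary F1 score to the Jaccard index via J = F / (2 - F)."""
    return f1 / (2.0 - f1)


# Per-fold values copied from the Extra Tree block of this log.
test_fscore = [0.94117647, 0.93333333, 0.85714286, 1.0, 0.875,
               0.71428571, 0.71428571, 0.8, 0.93333333, 0.94117647]
test_jcc = [0.88888889, 0.875, 0.75, 1.0, 0.77777778,
            0.55555556, 0.55555556, 0.66666667, 0.875, 0.88888889]

# Compare within a small tolerance because the log rounds to 8 decimals.
all_match = all(
    math.isclose(jaccard_from_f1(f), j, abs_tol=1e-6)
    for f, j in zip(test_fscore, test_jcc)
)
print("J = F / (2 - F) holds for every fold:", all_match)
```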
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.0097611 1.00532484 1.01266861 1.00928307 1.0162096 1.00374866
|
|
1.01397824 1.01551723 1.02490425 1.04691529]
|
|
|
|
mean value: 1.0158310890197755
|
|
|
|
key: score_time
|
|
value: [0.15017748 0.09301543 0.09229612 0.09591055 0.09012294 0.09085989
|
|
0.09002423 0.09411788 0.09723639 0.09498525]
|
|
|
|
mean value: 0.09887461662292481
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 0.76376262 1. 0.875 0.73214286
|
|
0.60714286 0.73214286 0.87287156 0.875 ]
|
|
|
|
mean value: 0.8339979851886711
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 0.86666667 1. 0.93333333 0.86666667
|
|
0.8 0.86666667 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9137500000000001
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 0.875 1. 0.93333333 0.85714286
|
|
0.8 0.875 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9156162464985994
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 0.77777778 1. 0.875 0.85714286
|
|
0.85714286 0.875 0.88888889 1. ]
|
|
|
|
mean value: 0.901984126984127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
0.75 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 0.875 1. 0.9375 0.86607143
|
|
0.80357143 0.86607143 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9151785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 0.77777778 1. 0.875 0.75
|
|
0.66666667 0.77777778 0.88888889 0.875 ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.84915781 0.96325994 0.88296103 0.89262009 0.85590529 0.8617866
|
|
0.86219358 0.88421559 0.87171268 0.90677834]
|
|
|
|
mean value: 0.8830590963363647
|
|
|
|
key: score_time
|
|
value: [0.23183656 0.20871425 0.23059011 0.22569108 0.22029448 0.24542952
|
|
0.22994447 0.24367285 0.24643469 0.23650432]
|
|
|
|
mean value: 0.23191123008728026
|
|
|
|
key: test_mcc
|
|
value: [1. 0.75 0.76376262 1. 0.73214286 0.60714286
|
|
0.60714286 0.73214286 0.87287156 0.875 ]
|
|
|
|
mean value: 0.794020560534137
|
|
|
|
key: train_mcc
|
|
value: [0.98540068 0.94117647 0.98550418 0.97120941 0.94160273 0.98550418
|
|
0.98550725 0.98550725 0.97122151 0.97122151]
|
|
|
|
mean value: 0.9723855158091337
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 0.86666667 1. 0.86666667 0.8
|
|
0.8 0.86666667 0.93333333 0.93333333]
|
|
|
|
mean value: 0.8941666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.99264706 0.97058824 0.99270073 0.98540146 0.97080292 0.99270073
|
|
0.99270073 0.99270073 0.98540146 0.98540146]
|
|
|
|
mean value: 0.986104551309575
|
|
|
|
key: test_fscore
|
|
value: [1. 0.875 0.875 1. 0.85714286 0.8
|
|
0.8 0.875 0.94117647 0.93333333]
|
|
|
|
mean value: 0.8956652661064426
|
|
|
|
key: train_fscore
|
|
value: [0.99270073 0.97058824 0.99280576 0.98571429 0.97101449 0.99280576
|
|
0.99270073 0.99270073 0.98550725 0.98550725]
|
|
|
|
mean value: 0.9862045207088039
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.77777778 1. 0.85714286 0.75
|
|
0.85714286 0.875 0.88888889 1. ]
|
|
|
|
mean value: 0.888095238095238
|
|
|
|
key: train_precision
|
|
value: [0.98550725 0.97058824 0.98571429 0.97183099 0.97101449 0.98571429
|
|
0.98550725 0.98550725 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9784241167379383
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 1. 1. 0.85714286 0.85714286
|
|
0.75 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.9089285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 0.97058824 1. 1. 0.97101449 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.994160272804774
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 0.875 1. 0.86607143 0.80357143
|
|
0.80357143 0.86607143 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.8955357142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.99264706 0.97058824 0.99264706 0.98529412 0.97080136 0.99264706
|
|
0.99275362 0.99275362 0.98550725 0.98550725]
|
|
|
|
mean value: 0.986114663256607
|
|
|
|
key: test_jcc
|
|
value: [1. 0.77777778 0.77777778 1. 0.75 0.66666667
|
|
0.66666667 0.77777778 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8180555555555555
|
|
|
|
key: train_jcc
|
|
value: [0.98550725 0.94285714 0.98571429 0.97183099 0.94366197 0.98571429
|
|
0.98550725 0.98550725 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9729157554019771
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01659274 0.00693059 0.00675678 0.0067389 0.00669909 0.00674319
|
|
0.00680494 0.00672269 0.00675416 0.00671124]
|
|
|
|
mean value: 0.007745432853698731
|
|
|
|
key: score_time
|
|
value: [0.01080561 0.00839496 0.00837088 0.00777602 0.00775194 0.00776267
|
|
0.00777245 0.00776482 0.00776577 0.00774288]
|
|
|
|
mean value: 0.008190798759460449
|
|
|
|
key: test_mcc
|
|
value: [0.25 0.25819889 0.07142857 0.33928571 0.46428571 0.13363062
|
|
0.33928571 0.46428571 0.33928571 0.49099025]
|
|
|
|
mean value: 0.3150676906591499
|
|
|
|
key: train_mcc
|
|
value: [0.48788604 0.49441323 0.48933032 0.47900717 0.52059257 0.46076782
|
|
0.4312221 0.41698711 0.44522592 0.43208129]
|
|
|
|
mean value: 0.46575135687893415
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.625 0.53333333 0.66666667 0.73333333 0.53333333
|
|
0.66666667 0.73333333 0.66666667 0.73333333]
|
|
|
|
mean value: 0.6516666666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.74264706 0.74264706 0.74452555 0.73722628 0.75912409 0.72992701
|
|
0.71532847 0.7080292 0.72262774 0.71532847]
|
|
|
|
mean value: 0.7317410905968227
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.57142857 0.53333333 0.66666667 0.71428571 0.63157895
|
|
0.66666667 0.75 0.66666667 0.71428571]
|
|
|
|
mean value: 0.6539912280701754
|
|
|
|
key: train_fscore
|
|
value: [0.75524476 0.76510067 0.75177305 0.75675676 0.77241379 0.74125874
|
|
0.71942446 0.71428571 0.72058824 0.72340426]
|
|
|
|
mean value: 0.7420250432480666
|
|
|
|
key: test_precision
|
|
value: [0.625 0.66666667 0.5 0.625 0.71428571 0.5
|
|
0.71428571 0.75 0.71428571 0.83333333]
|
|
|
|
mean value: 0.6642857142857143
|
|
|
|
key: train_precision
|
|
value: [0.72 0.7037037 0.73611111 0.70886076 0.73684211 0.71621622
|
|
0.70422535 0.69444444 0.72058824 0.69863014]
|
|
|
|
mean value: 0.7139622064625399
|
|
|
|
key: test_recall
|
|
value: [0.625 0.5 0.57142857 0.71428571 0.71428571 0.85714286
|
|
0.625 0.75 0.625 0.625 ]
|
|
|
|
mean value: 0.6607142857142857
|
|
|
|
key: train_recall
|
|
value: [0.79411765 0.83823529 0.76811594 0.8115942 0.8115942 0.76811594
|
|
0.73529412 0.73529412 0.72058824 0.75 ]
|
|
|
|
mean value: 0.7732949701619778
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.625 0.53571429 0.66964286 0.73214286 0.55357143
|
|
0.66964286 0.73214286 0.66964286 0.74107143]
|
|
|
|
mean value: 0.6553571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.74264706 0.74264706 0.74435209 0.73667945 0.75873828 0.72964621
|
|
0.71547315 0.70822677 0.72261296 0.71557971]
|
|
|
|
mean value: 0.7316602728047741
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.4 0.36363636 0.5 0.55555556 0.46153846
|
|
0.5 0.6 0.5 0.55555556]
|
|
|
|
mean value: 0.4890831390831391
|
|
|
|
key: train_jcc
|
|
value: [0.60674157 0.61956522 0.60227273 0.60869565 0.62921348 0.58888889
|
|
0.56179775 0.55555556 0.56321839 0.56666667]
|
|
|
|
mean value: 0.5902615907742418
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.05043268 0.03555036 0.05953455 0.03447628 0.03820968 0.0348382
|
|
0.03491855 0.03489041 0.03481722 0.03513861]
|
|
|
|
mean value: 0.03928065299987793
|
|
|
|
key: score_time
|
|
value: [0.01032662 0.01029825 0.0103786 0.01031733 0.01061702 0.01034212
|
|
0.01034379 0.01036835 0.01033378 0.01031613]
|
|
|
|
mean value: 0.010364198684692382
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 1. 1. 0.875 1.
|
|
0.87287156 1. 0.87287156 0.875 ]
|
|
|
|
mean value: 0.9377660225576135
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 1. 1. 0.93333333 1.
|
|
0.93333333 1. 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9670833333333333
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 1. 1. 0.93333333 1.
|
|
0.94117647 1. 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9690196078431372
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 1. 1. 0.875 1.
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9541666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 0.875]
|
|
|
|
mean value: 0.9875
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 1. 1. 0.9375 1.
|
|
0.92857143 1. 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9669642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 1. 1. 0.875 1.
|
|
0.88888889 1. 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9416666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01003599 0.01160932 0.01194644 0.01213264 0.0120163 0.01201224
|
|
0.01192141 0.01221108 0.01222897 0.01220608]
|
|
|
|
mean value: 0.011832046508789062
|
|
|
|
key: score_time
|
|
value: [0.01034403 0.01014495 0.01055765 0.01064253 0.01057649 0.01061487
|
|
0.01058221 0.01066971 0.01062822 0.01060128]
|
|
|
|
mean value: 0.01053619384765625
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.77459667 0.49099025 1. 0.73214286 0.73214286
|
|
0.87287156 0.76376262 0.75592895 0.75592895]
|
|
|
|
mean value: 0.7760281809053229
|
|
|
|
key: train_mcc
|
|
value: [0.89949371 0.91533482 0.90246052 0.91392776 0.92787101 0.95710706
|
|
0.9139999 0.92951942 0.92791659 0.92951942]
|
|
|
|
mean value: 0.9217150203470457
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.73333333 1. 0.86666667 0.86666667
|
|
0.93333333 0.86666667 0.86666667 0.86666667]
|
|
|
|
mean value: 0.88125
|
|
|
|
key: train_accuracy
|
|
value: [0.94852941 0.95588235 0.94890511 0.95620438 0.96350365 0.97810219
|
|
0.95620438 0.96350365 0.96350365 0.96350365]
|
|
|
|
mean value: 0.9597842421640189
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.88888889 0.75 1. 0.85714286 0.85714286
|
|
0.94117647 0.85714286 0.88888889 0.88888889]
|
|
|
|
mean value: 0.8870448179271708
|
|
|
|
key: train_fscore
|
|
value: [0.95035461 0.95774648 0.95172414 0.95774648 0.96453901 0.9787234
|
|
0.95714286 0.96453901 0.96402878 0.96453901]
|
|
|
|
mean value: 0.961108376525978
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.8 0.66666667 1. 0.85714286 0.85714286
|
|
0.88888889 1. 0.8 0.8 ]
|
|
|
|
mean value: 0.8558730158730159
|
|
|
|
key: train_precision
|
|
value: [0.91780822 0.91891892 0.90789474 0.93150685 0.94444444 0.95833333
|
|
0.93055556 0.93150685 0.94366197 0.93150685]
|
|
|
|
mean value: 0.9316137728048631
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.85714286 1. 0.85714286 0.85714286
|
|
1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.9321428571428572
|
|
|
|
key: train_recall
|
|
value: [0.98529412 1. 1. 0.98550725 0.98550725 1.
|
|
0.98529412 1. 0.98529412 1. ]
|
|
|
|
mean value: 0.99268968456948
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.875 0.74107143 1. 0.86607143 0.86607143
|
|
0.92857143 0.875 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8803571428571428
|
|
|
|
key: train_roc_auc
|
|
value: [0.94852941 0.95588235 0.94852941 0.95598892 0.96334186 0.97794118
|
|
0.95641517 0.96376812 0.96366155 0.96376812]
|
|
|
|
mean value: 0.9597826086956522
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.8 0.6 1. 0.75 0.75
|
|
0.88888889 0.75 0.8 0.8 ]
|
|
|
|
mean value: 0.8027777777777778
|
|
|
|
key: train_jcc
|
|
value: [0.90540541 0.91891892 0.90789474 0.91891892 0.93150685 0.95833333
|
|
0.91780822 0.93150685 0.93055556 0.93150685]
|
|
|
|
mean value: 0.9252355636097525
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.64
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00948811 0.00723052 0.00743365 0.00692177 0.00683284 0.00688004
|
|
0.00742602 0.00687981 0.00724554 0.00696445]
|
|
|
|
mean value: 0.00733027458190918
|
|
|
|
key: score_time
|
|
value: [0.01050973 0.00838256 0.00804853 0.00779343 0.00792456 0.00785303
|
|
0.00853586 0.00783825 0.00832796 0.00790906]
|
|
|
|
mean value: 0.008312296867370606
|
|
|
|
key: test_mcc
|
|
value: [0.37796447 0.25819889 0.37796447 0.32732684 0.60714286 0.37796447
|
|
0.49099025 0.33928571 0.33928571 0.46428571]
|
|
|
|
mean value: 0.3960409397159814
|
|
|
|
key: train_mcc
|
|
value: [0.47243088 0.54894692 0.5182264 0.47592003 0.46076782 0.5335339
|
|
0.4599318 0.4312221 0.47473887 0.47442455]
|
|
|
|
mean value: 0.4850143267959903
|
|
|
|
key: test_accuracy
|
|
value: [0.6875 0.625 0.66666667 0.66666667 0.8 0.66666667
|
|
0.73333333 0.66666667 0.66666667 0.73333333]
|
|
|
|
mean value: 0.6912499999999999
|
|
|
|
key: train_accuracy
|
|
value: [0.73529412 0.77205882 0.75912409 0.73722628 0.72992701 0.76642336
|
|
0.72992701 0.71532847 0.73722628 0.73722628]
|
|
|
|
mean value: 0.7419761700300558
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.57142857 0.70588235 0.61538462 0.8 0.70588235
|
|
0.71428571 0.66666667 0.66666667 0.75 ]
|
|
|
|
mean value: 0.6862863606981253
|
|
|
|
key: train_fscore
|
|
value: [0.74647887 0.7862069 0.76258993 0.75 0.74125874 0.77464789
|
|
0.72992701 0.71942446 0.73913043 0.73529412]
|
|
|
|
mean value: 0.7484958346591992
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.66666667 0.6 0.66666667 0.75 0.6
|
|
0.83333333 0.71428571 0.71428571 0.75 ]
|
|
|
|
mean value: 0.700952380952381
|
|
|
|
key: train_precision
|
|
value: [0.71621622 0.74025974 0.75714286 0.72 0.71621622 0.75342466
|
|
0.72463768 0.70422535 0.72857143 0.73529412]
|
|
|
|
mean value: 0.729598826685986
|
|
|
|
key: test_recall
|
|
value: [0.625 0.5 0.85714286 0.57142857 0.85714286 0.85714286
|
|
0.625 0.625 0.625 0.75 ]
|
|
|
|
mean value: 0.6892857142857143
|
|
|
|
key: train_recall
|
|
value: [0.77941176 0.83823529 0.76811594 0.7826087 0.76811594 0.79710145
|
|
0.73529412 0.73529412 0.75 0.73529412]
|
|
|
|
mean value: 0.7689471440750213
|
|
|
|
key: test_roc_auc
|
|
value: [0.6875 0.625 0.67857143 0.66071429 0.80357143 0.67857143
|
|
0.74107143 0.66964286 0.66964286 0.73214286]
|
|
|
|
mean value: 0.6946428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.73529412 0.77205882 0.75905797 0.73689258 0.72964621 0.76619778
|
|
0.7299659 0.71547315 0.73731884 0.73721228]
|
|
|
|
mean value: 0.7419117647058824
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.4 0.54545455 0.44444444 0.66666667 0.54545455
|
|
0.55555556 0.5 0.5 0.6 ]
|
|
|
|
mean value: 0.5257575757575758
|
|
|
|
key: train_jcc
|
|
value: [0.59550562 0.64772727 0.61627907 0.6 0.58888889 0.63218391
|
|
0.57471264 0.56179775 0.5862069 0.58139535]
|
|
|
|
mean value: 0.5984697399283192
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00826144 0.00782967 0.00737524 0.00810552 0.00831509 0.00756001
|
|
0.00788832 0.00767446 0.00802422 0.00808358]
|
|
|
|
mean value: 0.007911753654479981
|
|
|
|
key: score_time
|
|
value: [0.00887156 0.00873542 0.00787902 0.00805974 0.00852108 0.00807381
|
|
0.00853586 0.00868702 0.00864601 0.00846553]
|
|
|
|
mean value: 0.008447504043579102
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.5 0.47245559 0.64465837 0.73214286 0.60714286
|
|
0.64465837 0.87287156 0.64465837 0.6000992 ]
|
|
|
|
mean value: 0.6493283847542592
|
|
|
|
key: train_mcc
|
|
value: [0.76894131 0.91334626 0.54803747 0.87326937 0.94160273 0.83757093
|
|
0.91597649 0.88476385 0.87099729 0.88476385]
|
|
|
|
mean value: 0.8439269536443883
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.75 0.73333333 0.8 0.86666667 0.8
|
|
0.8 0.93333333 0.8 0.8 ]
|
|
|
|
mean value: 0.8158333333333334
|
|
|
|
key: train_accuracy
|
|
value: [0.875 0.95588235 0.72992701 0.93430657 0.97080292 0.91240876
|
|
0.95620438 0.94160584 0.93430657 0.94160584]
|
|
|
|
mean value: 0.9152050236152856
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.75 0.66666667 0.72727273 0.85714286 0.8
|
|
0.84210526 0.94117647 0.84210526 0.82352941]
|
|
|
|
mean value: 0.8107141516893839
|
|
|
|
key: train_fscore
|
|
value: [0.85950413 0.95454545 0.63366337 0.93129771 0.97101449 0.92
|
|
0.95774648 0.94285714 0.93617021 0.94285714]
|
|
|
|
mean value: 0.9049656133144264
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.8 1. 0.85714286 0.75
|
|
0.72727273 0.88888889 0.72727273 0.77777778]
|
|
|
|
mean value: 0.8278354978354978
|
|
|
|
key: train_precision
|
|
value: [0.98113208 0.984375 1. 0.98387097 0.97101449 0.85185185
|
|
0.91891892 0.91666667 0.90410959 0.91666667]
|
|
|
|
mean value: 0.9428606229112457
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.57142857 0.57142857 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.8232142857142857
|
|
|
|
key: train_recall
|
|
value: [0.76470588 0.92647059 0.46376812 0.88405797 0.97101449 1.
|
|
1. 0.97058824 0.97058824 0.97058824]
|
|
|
|
mean value: 0.8921781756180733
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.75 0.72321429 0.78571429 0.86607143 0.80357143
|
|
0.78571429 0.92857143 0.78571429 0.79464286]
|
|
|
|
mean value: 0.8098214285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.875 0.95588235 0.73188406 0.93467604 0.97080136 0.91176471
|
|
0.95652174 0.94181586 0.93456948 0.94181586]
|
|
|
|
mean value: 0.9154731457800511
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.6 0.5 0.57142857 0.75 0.66666667
|
|
0.72727273 0.88888889 0.72727273 0.7 ]
|
|
|
|
mean value: 0.6881529581529582
|
|
|
|
key: train_jcc
|
|
value: [0.75362319 0.91304348 0.46376812 0.87142857 0.94366197 0.85185185
|
|
0.91891892 0.89189189 0.88 0.89189189]
|
|
|
|
mean value: 0.8380079880422807
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00997114 0.01002455 0.00783682 0.00783896 0.00782728 0.00736952
|
|
0.00789356 0.00743914 0.00764585 0.0072844 ]
|
|
|
|
mean value: 0.00811312198638916
|
|
|
|
key: score_time
|
|
value: [0.01067495 0.00957513 0.00809073 0.00825882 0.0082314 0.00791526
|
|
0.00799417 0.00835299 0.00847554 0.00832438]
|
|
|
|
mean value: 0.008589339256286622
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.37796447 0.36689969 0.60714286 0.49099025 0.73214286
|
|
0.6000992 0.73214286 0.75592895 0.73214286]
|
|
|
|
mean value: 0.6170050660873226
|
|
|
|
key: train_mcc
|
|
value: [0.72669793 0.88580789 0.78788403 0.74493056 0.77817796 0.91597649
|
|
0.92951942 0.85434012 0.86000692 0.91240409]
|
|
|
|
mean value: 0.8395745411348854
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.6875 0.6 0.8 0.73333333 0.86666667
|
|
0.8 0.86666667 0.86666667 0.86666667]
|
|
|
|
mean value: 0.79625
|
|
|
|
key: train_accuracy
|
|
value: [0.84558824 0.94117647 0.88321168 0.86861314 0.88321168 0.95620438
|
|
0.96350365 0.9270073 0.9270073 0.95620438]
|
|
|
|
mean value: 0.9151728209531989
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.66666667 0.7 0.8 0.75 0.85714286
|
|
0.82352941 0.875 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8093370681605976
|
|
|
|
key: train_fscore
|
|
value: [0.8173913 0.93846154 0.8961039 0.87837838 0.89333333 0.95454545
|
|
0.96453901 0.92537313 0.93055556 0.95588235]
|
|
|
|
mean value: 0.9154563955087716
|
|
|
|
key: test_precision
|
|
value: [1. 0.71428571 0.53846154 0.75 0.66666667 0.85714286
|
|
0.77777778 0.875 0.8 0.875 ]
|
|
|
|
mean value: 0.7854334554334554
|
|
|
|
key: train_precision
|
|
value: [1. 0.98387097 0.81176471 0.82278481 0.82716049 1.
|
|
0.93150685 0.93939394 0.88157895 0.95588235]
|
|
|
|
mean value: 0.9153943066596637
|
|
|
|
key: test_recall
|
|
value: [0.75 0.625 1. 0.85714286 0.85714286 0.85714286
|
|
0.875 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.69117647 0.89705882 1. 0.94202899 0.97101449 0.91304348
|
|
1. 0.91176471 0.98529412 0.95588235]
|
|
|
|
mean value: 0.9267263427109974
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.6875 0.625 0.80357143 0.74107143 0.86607143
|
|
0.79464286 0.86607143 0.85714286 0.86607143]
|
|
|
|
mean value: 0.7982142857142858
|
|
|
|
key: train_roc_auc
|
|
value: [0.84558824 0.94117647 0.88235294 0.86807332 0.88256607 0.95652174
|
|
0.96376812 0.92689685 0.92742967 0.95620205]
|
|
|
|
mean value: 0.9150575447570333
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.5 0.53846154 0.66666667 0.6 0.75
|
|
0.7 0.77777778 0.8 0.77777778]
|
|
|
|
mean value: 0.6860683760683761
|
|
|
|
key: train_jcc
|
|
value: [0.69117647 0.88405797 0.81176471 0.78313253 0.80722892 0.91304348
|
|
0.93150685 0.86111111 0.87012987 0.91549296]
|
|
|
|
mean value: 0.8468644859831611
|
|
|
|
MCC on Blind test: 0.04
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07661414 0.06280541 0.0673337 0.06278038 0.06312895 0.06578207
|
|
0.06545353 0.06469059 0.06454372 0.06606507]
|
|
|
|
mean value: 0.06591975688934326
|
|
|
|
key: score_time
|
|
value: [0.01440525 0.01476049 0.01512003 0.01427507 0.01462126 0.01519179
|
|
0.01491117 0.01524901 0.01458573 0.01450872]
|
|
|
|
mean value: 0.01476285457611084
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 0.76376262 0.875 0.73214286 0.87287156
|
|
0.87287156 1. 0.87287156 0.73214286]
|
|
|
|
mean value: 0.8603580116631793
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 0.86666667 0.93333333 0.86666667 0.93333333
|
|
0.93333333 1. 0.93333333 0.86666667]
|
|
|
|
mean value: 0.9270833333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 0.875 0.93333333 0.85714286 0.92307692
|
|
0.94117647 1. 0.94117647 0.875 ]
|
|
|
|
mean value: 0.9287082525317819
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 0.77777778 0.875 0.85714286 1.
|
|
0.88888889 1. 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9051587301587302
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9589285714285715
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 0.875 0.9375 0.86607143 0.92857143
|
|
0.92857143 1. 0.92857143 0.86607143]
|
|
|
|
mean value: 0.9267857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 0.77777778 0.875 0.75 0.85714286
|
|
0.88888889 1. 0.88888889 0.77777778]
|
|
|
|
mean value: 0.870436507936508
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02507782 0.02816606 0.04579067 0.0453043 0.04246235 0.02354765
|
|
0.0451479 0.02337193 0.02406144 0.03010583]
|
|
|
|
mean value: 0.0333035945892334
|
|
|
|
key: score_time
|
|
value: [0.0173595 0.02051401 0.03634977 0.03514004 0.01607704 0.03413272
|
|
0.02627468 0.01672935 0.02149177 0.03459334]
|
|
|
|
mean value: 0.025866222381591798
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 1. 1. 0.875 0.73214286
|
|
0.87287156 1. 0.87287156 0.875 ]
|
|
|
|
mean value: 0.9109803082718992
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.98550725 1. 1. ]
|
|
|
|
mean value: 0.9985507246376811
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 1. 1. 0.93333333 0.86666667
|
|
0.93333333 1. 0.93333333 0.93333333]
|
|
|
|
mean value: 0.95375
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.99270073 1. 1. ]
|
|
|
|
mean value: 0.9992700729927008
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 1. 1. 0.93333333 0.85714286
|
|
0.94117647 1. 0.94117647 0.93333333]
|
|
|
|
mean value: 0.954733893557423
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.99270073 1. 1. ]
|
|
|
|
mean value: 0.9992700729927008
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 1. 1. 0.875 0.85714286
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9398809523809524
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.98550725 1. 1. ]
|
|
|
|
mean value: 0.9985507246376811
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 1. 1. 0.9375 0.86607143
|
|
0.92857143 1. 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9535714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.99275362 1. 1. ]
|
|
|
|
mean value: 0.9992753623188406
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 1. 1. 0.875 0.75
|
|
0.88888889 1. 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9166666666666666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.98550725 1. 1. ]
|
|
|
|
mean value: 0.9985507246376811
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03250933 0.03897357 0.01707721 0.01700234 0.01710153 0.01721501
|
|
0.04007459 0.0396409 0.03977489 0.04000473]
|
|
|
|
mean value: 0.02993741035461426
|
|
|
|
key: score_time
|
|
value: [0.01946115 0.01910114 0.01107264 0.01099253 0.01101065 0.01094913
|
|
0.02085972 0.01942563 0.01113796 0.02103782]
|
|
|
|
mean value: 0.015504837036132812
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.75 0.37796447 0.73214286 0.76376262 0.73214286
|
|
0.76376262 0.60714286 0.6000992 0.60714286]
|
|
|
|
mean value: 0.6816077435069778
|
|
|
|
key: train_mcc
|
|
value: [0.95598573 0.98540068 0.98550418 0.98550725 0.97080136 0.97080136
|
|
0.98550725 0.97080136 0.97120941 0.97080136]
|
|
|
|
mean value: 0.9752319946905791
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.66666667 0.86666667 0.86666667 0.86666667
|
|
0.86666667 0.8 0.8 0.8 ]
|
|
|
|
mean value: 0.8345833333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.97794118 0.99264706 0.99270073 0.99270073 0.98540146 0.98540146
|
|
0.99270073 0.98540146 0.98540146 0.98540146]
|
|
|
|
mean value: 0.9875697724345213
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.875 0.70588235 0.85714286 0.875 0.85714286
|
|
0.85714286 0.8 0.82352941 0.8 ]
|
|
|
|
mean value: 0.8392016806722689
|
|
|
|
key: train_fscore
|
|
value: [0.97777778 0.99259259 0.99280576 0.99270073 0.98550725 0.98550725
|
|
0.99270073 0.98529412 0.98507463 0.98529412]
|
|
|
|
mean value: 0.9875254940533481
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.875 0.6 0.85714286 0.77777778 0.85714286
|
|
1. 0.85714286 0.77777778 0.85714286]
|
|
|
|
mean value: 0.8348015873015873
|
|
|
|
key: train_precision
|
|
value: [0.98507463 1. 0.98571429 1. 0.98550725 0.98550725
|
|
0.98550725 0.98529412 1. 0.98529412]
|
|
|
|
mean value: 0.989789888700451
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 0.85714286 1. 0.85714286
|
|
0.75 0.75 0.875 0.75 ]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.97058824 0.98529412 1. 0.98550725 0.98550725 0.98550725
|
|
1. 0.98529412 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9853580562659847
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.875 0.67857143 0.86607143 0.875 0.86607143
|
|
0.875 0.80357143 0.79464286 0.80357143]
|
|
|
|
mean value: 0.8375
|
|
|
|
key: train_roc_auc
|
|
value: [0.97794118 0.99264706 0.99264706 0.99275362 0.98540068 0.98540068
|
|
0.99275362 0.98540068 0.98529412 0.98540068]
|
|
|
|
mean value: 0.9875639386189259
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.77777778 0.54545455 0.75 0.77777778 0.75
|
|
0.75 0.66666667 0.7 0.66666667]
|
|
|
|
mean value: 0.7273232323232323
|
|
|
|
key: train_jcc
|
|
value: [0.95652174 0.98529412 0.98571429 0.98550725 0.97142857 0.97142857
|
|
0.98550725 0.97101449 0.97058824 0.97101449]
|
|
|
|
mean value: 0.9754018998903909
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10531926 0.09862638 0.1012013 0.09502292 0.09023929 0.08958817
|
|
0.09062433 0.07970119 0.08648419 0.07744169]
|
|
|
|
mean value: 0.09142487049102783
|
|
|
|
key: score_time
|
|
value: [0.00943542 0.00918198 0.00938845 0.00950336 0.00970459 0.00936961
|
|
0.00830388 0.00854349 0.00833607 0.00825047]
|
|
|
|
mean value: 0.009001731872558594
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.8819171 1. 1. 0.875 0.87287156
|
|
0.87287156 0.87287156 0.87287156 0.875 ]
|
|
|
|
mean value: 0.9005320451152271
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.9375 1. 1. 0.93333333 0.93333333
|
|
0.93333333 0.93333333 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9475
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.94117647 1. 1. 0.93333333 0.92307692
|
|
0.94117647 0.94117647 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9495625942684767
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.88888889 1. 1. 0.875 1.
|
|
0.88888889 0.88888889 0.88888889 1. ]
|
|
|
|
mean value: 0.9319444444444445
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.9375 1. 1. 0.9375 0.92857143
|
|
0.92857143 0.92857143 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9464285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.88888889 1. 1. 0.875 0.85714286
|
|
0.88888889 0.88888889 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9051587301587302
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00895786 0.01093102 0.01075339 0.01098084 0.01611209 0.01137829
|
|
0.01158309 0.01124573 0.01122022 0.01167703]
|
|
|
|
mean value: 0.01148395538330078
|
|
|
|
key: score_time
|
|
value: [0.01024771 0.01018643 0.01022196 0.01059794 0.010957 0.01306653
|
|
0.01069069 0.01379347 0.01385617 0.01327443]
|
|
|
|
mean value: 0.011689233779907226
|
|
|
|
key: test_mcc
|
|
value: [1. 0.67419986 0.75592895 0.75592895 0.75592895 0.53452248
|
|
0.56407607 0.60714286 0.76376262 0.76376262]
|
|
|
|
mean value: 0.7175253347956024
|
|
|
|
key: train_mcc
|
|
value: [0.98540068 1. 1. 1. 1. 1.
|
|
0.87609014 1. 1. 1. ]
|
|
|
|
mean value: 0.9861490818102587
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.8125 0.86666667 0.86666667 0.86666667 0.73333333
|
|
0.73333333 0.8 0.86666667 0.86666667]
|
|
|
|
mean value: 0.84125
|
|
|
|
key: train_accuracy
|
|
value: [0.99264706 1. 1. 1. 1. 1.
|
|
0.93430657 1. 1. 1. ]
|
|
|
|
mean value: 0.9926953628166595
|
|
|
|
key: test_fscore
|
|
value: [1. 0.76923077 0.83333333 0.83333333 0.83333333 0.6
|
|
0.66666667 0.8 0.85714286 0.85714286]
|
|
|
|
mean value: 0.805018315018315
|
|
|
|
key: train_fscore
|
|
value: [0.99259259 1. 1. 1. 1. 1.
|
|
0.92913386 1. 1. 1. ]
|
|
|
|
mean value: 0.9921726450860309
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.625 0.71428571 0.71428571 0.71428571 0.42857143
|
|
0.5 0.75 0.75 0.75 ]
|
|
|
|
mean value: 0.6946428571428571
|
|
|
|
key: train_recall
|
|
value: [0.98529412 1. 1. 1. 1. 1.
|
|
0.86764706 1. 1. 1. ]
|
|
|
|
mean value: 0.9852941176470589
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.8125 0.85714286 0.85714286 0.85714286 0.71428571
|
|
0.75 0.80357143 0.875 0.875 ]
|
|
|
|
mean value: 0.8401785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.99264706 1. 1. 1. 1. 1.
|
|
0.93382353 1. 1. 1. ]
|
|
|
|
mean value: 0.9926470588235294
|
|
|
|
key: test_jcc
|
|
value: [1. 0.625 0.71428571 0.71428571 0.71428571 0.42857143
|
|
0.5 0.66666667 0.75 0.75 ]
|
|
|
|
mean value: 0.6863095238095238
|
|
|
|
key: train_jcc
|
|
value: [0.98529412 1. 1. 1. 1. 1.
|
|
0.86764706 1. 1. 1. ]
|
|
|
|
mean value: 0.9852941176470589
|
|
|
|
MCC on Blind test: -0.02
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01463962 0.00981808 0.00781131 0.00768018 0.00760412 0.00761986
|
|
0.00743604 0.00750971 0.00748968 0.00746846]
|
|
|
|
mean value: 0.008507704734802246
|
|
|
|
key: score_time
|
|
value: [0.01040816 0.0082798 0.00810122 0.00809526 0.00800514 0.0080471
|
|
0.00783396 0.00794506 0.00791073 0.00810385]
|
|
|
|
mean value: 0.008273029327392578
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.62994079 0.37796447 0.87287156 0.73214286 0.73214286
|
|
0.75592895 1. 0.75592895 0.6000992 ]
|
|
|
|
mean value: 0.7338936730461708
|
|
|
|
key: train_mcc
|
|
value: [0.82388584 0.88273483 0.85440207 0.85434012 0.89863497 0.88320546
|
|
0.90025835 0.84026462 0.88360693 0.86948194]
|
|
|
|
mean value: 0.8690815123547234
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.8125 0.66666667 0.93333333 0.86666667 0.86666667
|
|
0.86666667 1. 0.86666667 0.8 ]
|
|
|
|
mean value: 0.8616666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.91176471 0.94117647 0.9270073 0.9270073 0.94890511 0.94160584
|
|
0.94890511 0.91970803 0.94160584 0.93430657]
|
|
|
|
mean value: 0.9341992271361099
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.82352941 0.70588235 0.92307692 0.85714286 0.85714286
|
|
0.88888889 1. 0.88888889 0.82352941]
|
|
|
|
mean value: 0.8701414924944337
|
|
|
|
key: train_fscore
|
|
value: [0.91304348 0.94202899 0.92647059 0.92857143 0.95035461 0.94202899
|
|
0.95035461 0.92086331 0.94202899 0.9352518 ]
|
|
|
|
mean value: 0.9350996779361157
|
|
|
|
key: test_precision
|
|
value: [1. 0.77777778 0.6 1. 0.85714286 0.85714286
|
|
0.8 1. 0.8 0.77777778]
|
|
|
|
mean value: 0.846984126984127
|
|
|
|
key: train_precision
|
|
value: [0.9 0.92857143 0.94029851 0.91549296 0.93055556 0.94202899
|
|
0.91780822 0.90140845 0.92857143 0.91549296]
|
|
|
|
mean value: 0.9220228491043612
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.85714286 0.85714286 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9053571428571429
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.95588235 0.91304348 0.94202899 0.97101449 0.94202899
|
|
0.98529412 0.94117647 0.95588235 0.95588235]
|
|
|
|
mean value: 0.9488704177323103
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.67857143 0.92857143 0.86607143 0.86607143
|
|
0.85714286 1. 0.85714286 0.79464286]
|
|
|
|
mean value: 0.8598214285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.91176471 0.94117647 0.92710997 0.92689685 0.94874254 0.94160273
|
|
0.9491688 0.9198636 0.94170929 0.93446292]
|
|
|
|
mean value: 0.9342497868712702
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.7 0.54545455 0.85714286 0.75 0.75
|
|
0.8 1. 0.8 0.7 ]
|
|
|
|
mean value: 0.7777597402597403
|
|
|
|
key: train_jcc
|
|
value: [0.84 0.89041096 0.8630137 0.86666667 0.90540541 0.89041096
|
|
0.90540541 0.85333333 0.89041096 0.87837838]
|
|
|
|
mean value: 0.8783435764531655
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.07311392 0.06029248 0.06152678 0.05939054 0.05958748 0.05935431
|
|
0.05937529 0.05921316 0.0613966 0.06374431]
|
|
|
|
mean value: 0.06169948577880859
|
|
|
|
key: score_time
|
|
value: [0.00807023 0.00803971 0.00803089 0.00806856 0.00811672 0.00805664
|
|
0.00809479 0.00807714 0.00883532 0.0086627 ]
|
|
|
|
mean value: 0.008205270767211914
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.62994079 0.49099025 0.87287156 0.73214286 0.73214286
|
|
0.75592895 1. 0.75592895 0.6000992 ]
|
|
|
|
mean value: 0.7451962510483463
|
|
|
|
key: train_mcc
|
|
value: [0.85442069 0.87000211 0.89863497 0.85434012 0.92787101 0.91277477
|
|
0.90025835 0.8555278 0.88360693 0.88668406]
|
|
|
|
mean value: 0.8844120809526788
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.8125 0.73333333 0.93333333 0.86666667 0.86666667
|
|
0.86666667 1. 0.86666667 0.8 ]
|
|
|
|
mean value: 0.8683333333333334
|
|
|
|
key: train_accuracy
|
|
value: [0.92647059 0.93382353 0.94890511 0.9270073 0.96350365 0.95620438
|
|
0.94890511 0.9270073 0.94160584 0.94160584]
|
|
|
|
mean value: 0.9415038643194504
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:163: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:166: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.82352941 0.75 0.92307692 0.85714286 0.85714286
|
|
0.88888889 1. 0.88888889 0.82352941]
|
|
|
|
mean value: 0.8753375709258062
|
|
|
|
key: train_fscore
|
|
value: [0.92857143 0.93617021 0.95035461 0.92857143 0.96453901 0.95714286
|
|
0.95035461 0.92857143 0.94202899 0.94366197]
|
|
|
|
mean value: 0.9429966539911687
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.77777778 0.66666667 1. 0.85714286 0.85714286
|
|
0.8 1. 0.8 0.77777778]
|
|
|
|
mean value: 0.8425396825396825
|
|
|
|
key: train_precision
|
|
value: [0.90277778 0.90410959 0.93055556 0.91549296 0.94444444 0.94366197
|
|
0.91780822 0.90277778 0.92857143 0.90540541]
|
|
|
|
mean value: 0.9195605127329032
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 0.85714286 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9178571428571428
|
|
|
|
key: train_recall
|
|
value: [0.95588235 0.97058824 0.97101449 0.94202899 0.98550725 0.97101449
|
|
0.98529412 0.95588235 0.95588235 0.98529412]
|
|
|
|
mean value: 0.9678388746803069
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.74107143 0.92857143 0.86607143 0.86607143
|
|
0.85714286 1. 0.85714286 0.79464286]
|
|
|
|
mean value: 0.8660714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.92647059 0.93382353 0.94874254 0.92689685 0.96334186 0.95609548
|
|
0.9491688 0.92721654 0.94170929 0.94192242]
|
|
|
|
mean value: 0.941538789428815
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.7 0.6 0.85714286 0.75 0.75
|
|
0.8 1. 0.8 0.7 ]
|
|
|
|
mean value: 0.7846031746031746
|
|
|
|
key: train_jcc
|
|
value: [0.86666667 0.88 0.90540541 0.86666667 0.93150685 0.91780822
|
|
0.90540541 0.86666667 0.89041096 0.89333333]
|
|
|
|
mean value: 0.8923870171541405
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01548505 0.01318932 0.01122952 0.01181602 0.01175117 0.01169848
|
|
0.01112366 0.01132369 0.0109632 0.01237416]
|
|
|
|
mean value: 0.012095427513122559
|
|
|
|
key: score_time
|
|
value: [0.01040673 0.00819731 0.0085001 0.00783849 0.00782514 0.00842071
|
|
0.0079546 0.00785255 0.00782204 0.00846028]
|
|
|
|
mean value: 0.008327794075012208
|
|
|
|
key: test_mcc
|
|
value: [0.35 0.35 0.8 1. 0.79056942 0.8
|
|
0.5 0.5 0.25819889 1. ]
|
|
|
|
mean value: 0.6348768304789256
|
|
|
|
key: train_mcc
|
|
value: [0.87044534 0.87035806 0.87044534 0.81836616 0.81836616 0.84412955
|
|
0.84615385 0.84615385 0.84615385 0.84615385]
|
|
|
|
mean value: 0.8476726003234742
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.66666667 0.88888889 1. 0.88888889 0.88888889
|
|
0.75 0.75 0.625 1. ]
|
|
|
|
mean value: 0.8125
|
|
|
|
key: train_accuracy
|
|
value: [0.93506494 0.93506494 0.93506494 0.90909091 0.90909091 0.92207792
|
|
0.92307692 0.92307692 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9237762237762238
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.88888889 1. 0.90909091 0.88888889
|
|
0.75 0.75 0.57142857 1. ]
|
|
|
|
mean value: 0.8091630591630592
|
|
|
|
key: train_fscore
|
|
value: [0.93506494 0.93670886 0.93506494 0.90666667 0.90666667 0.92105263
|
|
0.92307692 0.92307692 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9233532388109337
|
|
|
|
key: test_precision
|
|
value: [0.6 0.6 0.8 1. 0.83333333 1.
|
|
0.75 0.75 0.66666667 1. ]
|
|
|
|
mean value: 0.8
|
|
|
|
key: train_precision
|
|
value: [0.94736842 0.925 0.94736842 0.91891892 0.91891892 0.92105263
|
|
0.92307692 0.92307692 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9270935003829741
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 1. 1. 1. 0.8 0.75 0.75 0.5 1. ]
|
|
|
|
mean value: 0.83
|
|
|
|
key: train_recall
|
|
value: [0.92307692 0.94871795 0.92307692 0.89473684 0.89473684 0.92105263
|
|
0.92307692 0.92307692 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9197705802968961
|
|
|
|
key: test_roc_auc
|
|
value: [0.675 0.675 0.9 1. 0.875 0.9 0.75 0.75 0.625 1. ]
|
|
|
|
mean value: 0.8150000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [0.93522267 0.93488529 0.93522267 0.90890688 0.90890688 0.92206478
|
|
0.92307692 0.92307692 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9237516869095818
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.8 1. 0.83333333 0.8
|
|
0.6 0.6 0.4 1. ]
|
|
|
|
mean value: 0.7033333333333334
|
|
|
|
key: train_jcc
|
|
value: [0.87804878 0.88095238 0.87804878 0.82926829 0.82926829 0.85365854
|
|
0.85714286 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8577816492450638
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.57
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.28448343 0.26222992 0.30907059 0.30741835 0.29078984 0.30970097
|
|
0.2858026 0.2821455 0.30858493 0.30905151]
|
|
|
|
mean value: 0.2949277639389038
|
|
|
|
key: score_time
|
|
value: [0.00840044 0.00826621 0.00892925 0.00815058 0.00974989 0.00875688
|
|
0.00868368 0.00955057 0.00915575 0.00842237]
|
|
|
|
mean value: 0.008806562423706055
|
|
|
|
key: test_mcc
|
|
value: [0.1 0.35 0.8 0.79056942 1. 1.
|
|
1. 0.5 0.57735027 1. ]
|
|
|
|
mean value: 0.711791968423172
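Note: each "mean value" line is the arithmetic mean of the ten per-fold values printed just before it. A small check for test_mcc, using the fold values exactly as printed (they are rounded to 8 decimals in the log, so agreement is only expected to about that precision):

```python
from statistics import mean

# Per-fold test_mcc values copied from the log above (rounded as printed).
test_mcc = [0.1, 0.35, 0.8, 0.79056942, 1.0, 1.0, 1.0, 0.5, 0.57735027, 1.0]

mean_mcc = mean(test_mcc)
print("mean test_mcc:", mean_mcc)
```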
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.55555556 0.66666667 0.88888889 0.88888889 1. 1.
|
|
1. 0.75 0.75 1. ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.66666667 0.88888889 0.90909091 1. 1.
|
|
1. 0.75 0.66666667 1. ]
|
|
|
|
mean value: 0.8381313131313131
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.6 0.8 0.83333333 1. 1.
|
|
1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.8483333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 1. 1. 1. 1. 1. 0.75 0.5 1. ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.55 0.675 0.9 0.875 1. 1. 1. 0.75 0.75 1. ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.5 0.8 0.83333333 1. 1.
|
|
1. 0.6 0.5 1. ]
|
|
|
|
mean value: 0.7566666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.63
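Note: the per-fold metrics above are all functions of one confusion matrix per fold. The sketch below reproduces the fourth test fold of the LogisticRegressionCV block. The counts TP=5, FP=1, FN=0, TN=3 are not printed in the log; they are an assumption inferred from that fold's reported precision (0.83333333), recall (1.0) and accuracy (0.88888889). The function name fold_metrics is hypothetical. The roc_auc formula assumes the score was computed from hard class predictions, in which case it equals the mean of recall and specificity; the logged value is consistent with that.

```python
from math import sqrt

def fold_metrics(tp, fp, fn, tn):
    """Binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "roc_auc": (recall + specificity) / 2,  # valid for hard 0/1 predictions
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Inferred (assumed) counts for the fourth test fold, 9 samples in total.
m = fold_metrics(tp=5, fp=1, fn=0, tn=3)
for name, value in m.items():
    print(f"{name}: {value:.8f}")
```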
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0089488 0.00875568 0.0074203 0.00725079 0.00705767 0.00712848
|
|
0.00707436 0.00722599 0.00702119 0.00702357]
|
|
|
|
mean value: 0.007490682601928711
|
|
|
|
key: score_time
|
|
value: [0.0106101 0.01032877 0.00850701 0.00852227 0.0084753 0.00805879
|
|
0.00836706 0.00811815 0.00836396 0.00837588]
|
|
|
|
mean value: 0.008772730827331543
|
|
|
|
key: test_mcc
|
|
value: [ 0.39528471 0.5976143 0.5976143 0.47809144 0.15811388 0.35
|
|
0.25819889 0.37796447 0.25819889 -0.25819889]
|
|
|
|
mean value: 0.3212882006354006
|
|
|
|
key: train_mcc
|
|
value: [0.54521744 0.52542209 0.52542209 0.53924899 0.54085245 0.53924899
|
|
0.52790958 0.54772256 0.58722022 0.60697698]
|
|
|
|
mean value: 0.5485241391056351
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.77777778 0.77777778 0.66666667 0.55555556 0.66666667
|
|
0.625 0.625 0.625 0.375 ]
|
|
|
|
mean value: 0.6361111111111111
|
|
|
|
key: train_accuracy
|
|
value: [0.72727273 0.71428571 0.71428571 0.72727273 0.74025974 0.72727273
|
|
0.71794872 0.73076923 0.75641026 0.76923077]
|
|
|
|
mean value: 0.7325008325008325
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.66666667 0.66666667 0.57142857 0.5 0.66666667
|
|
0.57142857 0.4 0.57142857 0.28571429]
|
|
|
|
mean value: 0.53
|
|
|
|
key: train_fscore
|
|
value: [0.63157895 0.60714286 0.60714286 0.61818182 0.65517241 0.61818182
|
|
0.60714286 0.63157895 0.6779661 0.7 ]
|
|
|
|
mean value: 0.6354088618017069
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 0.66666667 0.75
|
|
0.66666667 1. 0.66666667 0.33333333]
|
|
|
|
mean value: 0.8083333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 0.95 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.995
|
|
|
|
key: test_recall
|
|
value: [0.25 0.5 0.5 0.4 0.4 0.6 0.5 0.25 0.5 0.25]
|
|
|
|
mean value: 0.415
|
|
|
|
key: train_recall
|
|
value: [0.46153846 0.43589744 0.43589744 0.44736842 0.5 0.44736842
|
|
0.43589744 0.46153846 0.51282051 0.53846154]
|
|
|
|
mean value: 0.4676788124156545
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.75 0.75 0.7 0.575 0.675 0.625 0.625 0.625 0.375]
|
|
|
|
mean value: 0.6325
|
|
|
|
key: train_roc_auc
|
|
value: [0.73076923 0.71794872 0.71794872 0.72368421 0.73717949 0.72368421
|
|
0.71794872 0.73076923 0.75641026 0.76923077]
|
|
|
|
mean value: 0.732557354925776
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.5 0.5 0.4 0.33333333 0.5
|
|
0.4 0.25 0.4 0.16666667]
|
|
|
|
mean value: 0.37
|
|
|
|
key: train_jcc
|
|
value: [0.46153846 0.43589744 0.43589744 0.44736842 0.48717949 0.44736842
|
|
0.43589744 0.46153846 0.51282051 0.53846154]
|
|
|
|
mean value: 0.46639676113360323
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00678349 0.00722671 0.00688362 0.00727487 0.00721765 0.00729203
|
|
0.00733399 0.00746918 0.00723958 0.00735378]
|
|
|
|
mean value: 0.007207489013671875
|
|
|
|
key: score_time
|
|
value: [0.00807953 0.00803757 0.00839972 0.00815082 0.00826144 0.0078733
|
|
0.00901914 0.0084033 0.00845194 0.00837016]
|
|
|
|
mean value: 0.008304691314697266
|
|
|
|
key: test_mcc
|
|
value: [-0.31622777 0.63245553 0.63245553 0.15811388 0.31622777 0.35
|
|
0.57735027 0.77459667 -0.25819889 0. ]
|
|
|
|
mean value: 0.2866772995759719
|
|
|
|
key: train_mcc
|
|
value: [0.53279352 0.50745677 0.5064147 0.45639039 0.53591229 0.42943967
|
|
0.51298918 0.46537892 0.59684919 0.41367015]
|
|
|
|
mean value: 0.49572947743384127
|
|
|
|
key: test_accuracy
|
|
value: [0.33333333 0.77777778 0.77777778 0.55555556 0.66666667 0.66666667
|
|
0.75 0.875 0.375 0.5 ]
|
|
|
|
mean value: 0.6277777777777778
|
|
|
|
key: train_accuracy
|
|
value: [0.76623377 0.75324675 0.75324675 0.72727273 0.76623377 0.71428571
|
|
0.75641026 0.73076923 0.79487179 0.70512821]
|
|
|
|
mean value: 0.7467698967698968
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.8 0.8 0.5 0.72727273 0.66666667
|
|
0.66666667 0.85714286 0.44444444 0.5 ]
|
|
|
|
mean value: 0.6362193362193362
|
|
|
|
key: train_fscore
|
|
value: [0.775 0.7654321 0.75949367 0.73417722 0.775 0.71794872
|
|
0.75949367 0.74698795 0.80952381 0.72289157]
|
|
|
|
mean value: 0.7565948701272275
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.66666667 0.66666667 0.66666667 0.66666667 0.75
|
|
1. 1. 0.4 0.5 ]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_precision
|
|
value: [0.75609756 0.73809524 0.75 0.70731707 0.73809524 0.7
|
|
0.75 0.70454545 0.75555556 0.68181818]
|
|
|
|
mean value: 0.7281524302256009
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 0.4 0.8 0.6 0.5 0.75 0.5 0.5 ]
|
|
|
|
mean value: 0.655
|
|
|
|
key: train_recall
|
|
value: [0.79487179 0.79487179 0.76923077 0.76315789 0.81578947 0.73684211
|
|
0.76923077 0.79487179 0.87179487 0.76923077]
|
|
|
|
mean value: 0.7879892037786774
|
|
|
|
key: test_roc_auc
|
|
value: [0.35 0.8 0.8 0.575 0.65 0.675 0.75 0.875 0.375 0.5 ]
|
|
|
|
mean value: 0.635
|
|
|
|
key: train_roc_auc
|
|
value: [0.76585695 0.75269906 0.75303644 0.72773279 0.7668691 0.7145749
|
|
0.75641026 0.73076923 0.79487179 0.70512821]
|
|
|
|
mean value: 0.7467948717948718
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.66666667 0.66666667 0.33333333 0.57142857 0.5
|
|
0.5 0.75 0.28571429 0.33333333]
|
|
|
|
mean value: 0.4857142857142857
|
|
|
|
key: train_jcc
|
|
value: [0.63265306 0.62 0.6122449 0.58 0.63265306 0.56
|
|
0.6122449 0.59615385 0.68 0.56603774]
|
|
|
|
mean value: 0.609198750037025
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
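Note on the 'prep' step printed above: the pipeline rescales the numeric columns with MinMaxScaler and expands the categorical columns with OneHotEncoder before the model sees them. The following is a minimal pure-Python sketch of what those two transforms compute, not the scikit-learn code the script runs. The column names are taken from the log, but the toy values and the helper names `min_max_scale` and `one_hot` are assumptions made for illustration.

```python
# Pure-Python illustration of the two transforms named in the 'prep' step.
# The toy rows are hypothetical; only the column names come from the log.

def min_max_scale(values):
    """Rescale values to [0, 1] as (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has zero span; map it to 0.0 to avoid dividing by zero.
    return [(v - lo) / span if span else 0.0 for v in values]

def one_hot(values):
    """Return (sorted categories, one indicator row per input value)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

ligand_distance = [2.0, 6.0, 10.0]       # hypothetical numeric column
ss_class = ["helix", "sheet", "helix"]   # hypothetical categorical column

scaled = min_max_scale(ligand_distance)
categories, encoded = one_hot(ss_class)

print(scaled)      # [0.0, 0.5, 1.0]
print(categories)  # ['helix', 'sheet']
print(encoded)     # [[1, 0], [0, 1], [1, 0]]
```

Inside a cross-validated pipeline the minimum, maximum and category set would normally be learned on each training fold and then applied to the held-out fold, which this sketch does not model.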
|
|
|
|
key: fit_time
|
|
value: [0.00730515 0.00695133 0.00699258 0.00709629 0.00697374 0.00705719
|
|
0.0069747 0.00722718 0.00697184 0.00727534]
|
|
|
|
mean value: 0.007082533836364746
|
|
|
|
key: score_time
|
|
value: [0.00949669 0.00922227 0.00921607 0.00929904 0.00929213 0.00919104
|
|
0.00917578 0.00927401 0.00928402 0.00928831]
|
|
|
|
mean value: 0.009273934364318847
|
|
|
|
key: test_mcc
|
|
value: [-0.15811388 0.15811388 0.8 0.63245553 0.31622777 0.8
|
|
0.5 0.25819889 0.25819889 0.57735027]
|
|
|
|
mean value: 0.4142431346734462
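Note: the "mean value" line is the arithmetic mean of the ten per-fold scores printed just above it. A short check using the K-Nearest Neighbors test_mcc values copied from the log is sketched below. The sample standard deviation is not reported in the log and is computed here only to show how much the folds disagree.

```python
from statistics import fmean, stdev

# Per-fold test_mcc values for K-Nearest Neighbors, copied from the log.
test_mcc = [-0.15811388, 0.15811388, 0.8, 0.63245553, 0.31622777, 0.8,
            0.5, 0.25819889, 0.25819889, 0.57735027]

mean_mcc = fmean(test_mcc)   # arithmetic mean over the 10 folds
spread = stdev(test_mcc)     # sample standard deviation (not in the log)

print(f"mean test_mcc = {mean_mcc:.4f}")   # mean test_mcc = 0.4142
print(f"sample std dev = {spread:.4f}")
```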
|
|
|
|
key: train_mcc
|
|
value: [0.58541539 0.66239043 0.61039852 0.61039852 0.55870445 0.61066127
|
|
0.64187021 0.64102564 0.62050523 0.56577895]
|
|
|
|
mean value: 0.6107148606822335
|
|
|
|
key: test_accuracy
|
|
value: [0.44444444 0.55555556 0.88888889 0.77777778 0.66666667 0.88888889
|
|
0.75 0.625 0.625 0.75 ]
|
|
|
|
mean value: 0.6972222222222222
|
|
|
|
key: train_accuracy
|
|
value: [0.79220779 0.83116883 0.80519481 0.80519481 0.77922078 0.80519481
|
|
0.82051282 0.82051282 0.80769231 0.78205128]
|
|
|
|
mean value: 0.804895104895105
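Note: the accuracy fractions suggest the fold sizes even though the log never prints them. Test accuracies such as 0.44444444 (4/9) and 0.625 (5/8), together with train accuracies such as 0.79220779 (61/77) and 0.82051282 (64/78), are consistent with 86 samples split into 10 folds. The sample count of 86 is inferred from those fractions and is an assumption, as is the helper name `kfold_test_sizes`; the splitter actually used is not shown in this section.

```python
def kfold_test_sizes(n_samples, n_splits):
    """Test-fold sizes when the first n_samples % n_splits folds get one extra row."""
    base, extra = divmod(n_samples, n_splits)
    return [base + 1 if i < extra else base for i in range(n_splits)]

n_samples = 86   # assumption: inferred from the accuracy fractions in the log
n_splits = 10    # ten values are printed per metric

test_sizes = kfold_test_sizes(n_samples, n_splits)
train_sizes = [n_samples - s for s in test_sizes]

print(test_sizes)    # [9, 9, 9, 9, 9, 9, 8, 8, 8, 8]
print(train_sizes)   # [77, 77, 77, 77, 77, 77, 78, 78, 78, 78]
print(round(61 / 77, 8), round(64 / 78, 8))   # 0.79220779 0.82051282
```

With only 8 or 9 rows per test fold, a single misclassification moves accuracy by more than 0.11, which helps explain the wide fold-to-fold swings in the test metrics.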
|
|
|
|
key: test_fscore
|
|
value: [0.28571429 0.6 0.88888889 0.75 0.72727273 0.88888889
|
|
0.75 0.57142857 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6700288600288601
|
|
|
|
key: train_fscore
|
|
value: [0.78947368 0.83544304 0.81012658 0.8 0.77922078 0.80519481
|
|
0.825 0.82051282 0.81927711 0.79012346]
|
|
|
|
mean value: 0.8074372274615954
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.5 0.8 1. 0.66666667 1.
|
|
0.75 0.66666667 0.66666667 1. ]
|
|
|
|
mean value: 0.7383333333333333
|
|
|
|
key: train_precision
|
|
value: [0.81081081 0.825 0.8 0.81081081 0.76923077 0.79487179
|
|
0.80487805 0.82051282 0.77272727 0.76190476]
|
|
|
|
mean value: 0.7970747089649529
|
|
|
|
key: test_recall
|
|
value: [0.25 0.75 1. 0.6 0.8 0.8 0.75 0.5 0.5 0.5 ]
|
|
|
|
mean value: 0.645
|
|
|
|
key: train_recall
|
|
value: [0.76923077 0.84615385 0.82051282 0.78947368 0.78947368 0.81578947
|
|
0.84615385 0.82051282 0.87179487 0.82051282]
|
|
|
|
mean value: 0.8189608636977058
|
|
|
|
key: test_roc_auc
|
|
value: [0.425 0.575 0.9 0.8 0.65 0.9 0.75 0.625 0.625 0.75 ]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_roc_auc
|
|
value: [0.79251012 0.83097166 0.80499325 0.80499325 0.77935223 0.80533063
|
|
0.82051282 0.82051282 0.80769231 0.78205128]
|
|
|
|
mean value: 0.8048920377867747
|
|
|
|
key: test_jcc
|
|
value: [0.16666667 0.42857143 0.8 0.6 0.57142857 0.8
|
|
0.6 0.4 0.4 0.5 ]
|
|
|
|
mean value: 0.5266666666666666
|
|
|
|
key: train_jcc
|
|
value: [0.65217391 0.7173913 0.68085106 0.66666667 0.63829787 0.67391304
|
|
0.70212766 0.69565217 0.69387755 0.65306122]
|
|
|
|
mean value: 0.6774012472704161
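Note: every metric key in this block (mcc, accuracy, fscore, precision, recall, roc_auc, jcc) can be derived from one binary confusion matrix per fold. The sketch below uses the counts TP=1, FP=2, FN=3, TN=3, which are inferred from the first K-Nearest Neighbors test fold and are not printed in the log. The function name `binary_metrics` is hypothetical. The roc_auc entry is computed from hard class predictions, which for a binary problem equals the mean of recall and specificity; that reading is an assumption, although it matches the logged value.

```python
from math import sqrt

def binary_metrics(tp, fp, fn, tn):
    """Metrics from a 2x2 confusion matrix (no guard against empty classes)."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),                # Jaccard index
        "roc_auc": (recall + specificity) / 2,     # AUC of hard predictions
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Assumed counts, inferred from fold 1 of the K-Nearest Neighbors test metrics.
fold1 = binary_metrics(tp=1, fp=2, fn=3, tn=3)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

These counts reproduce the first entry of each test_* array above: accuracy 0.44444444, precision 0.33333333, recall 0.25, fscore 0.28571429, jcc 0.16666667, roc_auc 0.425 and mcc -0.15811388.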
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00880122 0.00814819 0.00756073 0.00777268 0.00736904 0.00765371
|
|
0.00792193 0.00778627 0.00790763 0.00790238]
|
|
|
|
mean value: 0.007882380485534668
|
|
|
|
key: score_time
|
|
value: [0.0086937 0.00912237 0.00853276 0.00825572 0.00864816 0.00843048
|
|
0.0086689 0.00874853 0.00819087 0.00858378]
|
|
|
|
mean value: 0.008587527275085449
|
|
|
|
key: test_mcc
|
|
value: [0.35 0.1 0.8 1. 0.5976143 0.8
|
|
0.77459667 1. 0. 0.77459667]
|
|
|
|
mean value: 0.6196807643150164
|
|
|
|
key: train_mcc
|
|
value: [0.84516739 0.82485566 0.84852502 0.848923 0.79675455 0.87044534
|
|
0.74456944 0.8720816 0.77563153 0.84726867]
|
|
|
|
mean value: 0.8274222215949533
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.55555556 0.88888889 1. 0.77777778 0.88888889
|
|
0.875 1. 0.5 0.875 ]
|
|
|
|
mean value: 0.8027777777777778
|
|
|
|
key: train_accuracy
|
|
value: [0.92207792 0.90909091 0.92207792 0.92207792 0.8961039 0.93506494
|
|
0.87179487 0.93589744 0.88461538 0.92307692]
|
|
|
|
mean value: 0.9121878121878122
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.5 0.88888889 1. 0.83333333 0.88888889
|
|
0.85714286 1. 0.5 0.85714286]
|
|
|
|
mean value: 0.7992063492063491
|
|
|
|
key: train_fscore
|
|
value: [0.925 0.91566265 0.92682927 0.925 0.9 0.93506494
|
|
0.875 0.93670886 0.89156627 0.925 ]
|
|
|
|
mean value: 0.9155831979779763
|
|
|
|
key: test_precision
|
|
value: [0.6 0.5 0.8 1. 0.71428571 1.
|
|
1. 1. 0.5 1. ]
|
|
|
|
mean value: 0.8114285714285714
|
|
|
|
key: train_precision
|
|
value: [0.90243902 0.86363636 0.88372093 0.88095238 0.85714286 0.92307692
|
|
0.85365854 0.925 0.84090909 0.90243902]
|
|
|
|
mean value: 0.8832975131316028
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 1. 1. 1. 0.8 0.75 1. 0.5 0.75]
|
|
|
|
mean value: 0.805
|
|
|
|
key: train_recall
|
|
value: [0.94871795 0.97435897 0.97435897 0.97368421 0.94736842 0.94736842
|
|
0.8974359 0.94871795 0.94871795 0.94871795]
|
|
|
|
mean value: 0.950944669365722
|
|
|
|
key: test_roc_auc
|
|
value: [0.675 0.55 0.9 1. 0.75 0.9 0.875 1. 0.5 0.875]
|
|
|
|
mean value: 0.8025
|
|
|
|
key: train_roc_auc
|
|
value: [0.9217274 0.90823212 0.92139001 0.92273954 0.89676113 0.93522267
|
|
0.87179487 0.93589744 0.88461538 0.92307692]
|
|
|
|
mean value: 0.9121457489878543
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.33333333 0.8 1. 0.71428571 0.8
|
|
0.75 1. 0.33333333 0.75 ]
|
|
|
|
mean value: 0.6980952380952381
|
|
|
|
key: train_jcc
|
|
value: [0.86046512 0.84444444 0.86363636 0.86046512 0.81818182 0.87804878
|
|
0.77777778 0.88095238 0.80434783 0.86046512]
|
|
|
|
mean value: 0.8448784740404756
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.52
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.32428074 0.29084134 0.39761615 0.38947153 0.38101339 0.47083735
|
|
0.4072361 0.57479668 0.46769285 0.39441442]
|
|
|
|
mean value: 0.40982005596160886
|
|
|
|
key: score_time
|
|
value: [0.01101065 0.01088691 0.01111293 0.01090026 0.0111537 0.01554298
|
|
0.0109508 0.01096511 0.01096678 0.01099515]
|
|
|
|
mean value: 0.011448526382446289
|
|
|
|
key: test_mcc
|
|
value: [0.1 0.35 0.8 0.8 0.31622777 0.8
|
|
0.5 0.77459667 0.25819889 0.77459667]
|
|
|
|
mean value: 0.5473619994246965
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.55555556 0.66666667 0.88888889 0.88888889 0.66666667 0.88888889
|
|
0.75 0.875 0.625 0.875 ]
|
|
|
|
mean value: 0.7680555555555555
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.66666667 0.88888889 0.88888889 0.72727273 0.88888889
|
|
0.75 0.88888889 0.57142857 0.85714286]
|
|
|
|
mean value: 0.7628066378066378
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.6 0.8 1. 0.66666667 1.
|
|
0.75 0.8 0.66666667 1. ]
|
|
|
|
mean value: 0.7783333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 1. 0.8 0.8 0.8 0.75 1. 0.5 0.75]
|
|
|
|
mean value: 0.765
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.55 0.675 0.9 0.9 0.65 0.9 0.75 0.875 0.625 0.875]
|
|
|
|
mean value: 0.77
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.5 0.8 0.8 0.57142857 0.8
|
|
0.6 0.8 0.4 0.75 ]
|
|
|
|
mean value: 0.6354761904761905
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.54
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00931001 0.00951576 0.00988293 0.00786901 0.00728393 0.01152086
|
|
0.00701046 0.00675249 0.00750613 0.01119971]
|
|
|
|
mean value: 0.008785128593444824
|
|
|
|
key: score_time
|
|
value: [0.01047301 0.01033044 0.00877452 0.00874352 0.00872278 0.01278138
|
|
0.00794363 0.00788903 0.00790906 0.0122633 ]
|
|
|
|
mean value: 0.009583067893981934
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 1. 1. 1. 0.63245553 1.
|
|
1. 1. 0.77459667 1. ]
|
|
|
|
mean value: 0.9039507733308835
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 1. 1. 1. 0.77777778 1.
|
|
1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9430555555555555
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.8 1. 1. 1. 0.75 1.
|
|
1. 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.9407142857142857
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.6 1. 1. 1. 0.75 1. ]
|
|
|
|
mean value: 0.935
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.8 1. 1. 1. 0.8 1. 1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9475
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 1. 1. 1. 0.6 1.
|
|
1. 1. 0.75 1. ]
|
|
|
|
mean value: 0.9016666666666666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07991552 0.07531953 0.07922387 0.07759547 0.07554555 0.07641506
|
|
0.08248544 0.07638526 0.07602501 0.07815456]
|
|
|
|
mean value: 0.07770652770996093
|
|
|
|
key: score_time
|
|
value: [0.01661062 0.01769233 0.01668501 0.01735091 0.01669407 0.01689482
|
|
0.01721072 0.01734948 0.0171845 0.01668453]
|
|
|
|
mean value: 0.017035698890686034
|
|
|
|
key: test_mcc
|
|
value: [0.55 0.35 0.8 0.8 0.79056942 0.8
|
|
1. 0.77459667 0.25819889 1. ]
|
|
|
|
mean value: 0.7123364974030739
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.66666667 0.88888889 0.88888889 0.88888889 0.88888889
|
|
1. 0.875 0.625 1. ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.66666667 0.88888889 0.88888889 0.90909091 0.88888889
|
|
1. 0.88888889 0.57142857 1. ]
|
|
|
|
mean value: 0.8452741702741703
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.6 0.8 1. 0.83333333 1.
|
|
1. 0.8 0.66666667 1. ]
|
|
|
|
mean value: 0.845
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 1. 0.8 1. 0.8 1. 1. 0.5 1. ]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.775 0.675 0.9 0.9 0.875 0.9 1. 0.875 0.625 1. ]
|
|
|
|
mean value: 0.8525
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.5 0.8 0.8 0.83333333 0.8
|
|
1. 0.8 0.4 1. ]
|
|
|
|
mean value: 0.7533333333333334
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00761938 0.00663662 0.00656319 0.00661063 0.00666404 0.00655985
|
|
0.006706 0.00655651 0.00683618 0.00663257]
|
|
|
|
mean value: 0.006738495826721191
|
|
|
|
key: score_time
|
|
value: [0.00826526 0.00768614 0.0077951 0.00779033 0.00775886 0.00782132
|
|
0.00778651 0.00774288 0.00777555 0.00771451]
|
|
|
|
mean value: 0.007813644409179688
|
|
|
|
key: test_mcc
|
|
value: [ 0.35 0.1 -0.15811388 0.1 -0.1 -0.5976143
|
|
0.25819889 0. 0. 0.25819889]
|
|
|
|
mean value: 0.021066959181870643
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.55555556 0.44444444 0.55555556 0.44444444 0.22222222
|
|
0.625 0.5 0.5 0.625 ]
|
|
|
|
mean value: 0.5138888888888888
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.5 0.28571429 0.6 0.44444444 0.
|
|
0.57142857 0.5 0.5 0.57142857]
|
|
|
|
mean value: 0.463968253968254
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 0.5 0.33333333 0.6 0.5 0.
|
|
0.66666667 0.5 0.5 0.66666667]
|
|
|
|
mean value: 0.48666666666666664
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 0.25 0.6 0.4 0. 0.5 0.5 0.5 0.5 ]
|
|
|
|
mean value: 0.45
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.675 0.55 0.425 0.55 0.45 0.25 0.625 0.5 0.5 0.625]
|
|
|
|
mean value: 0.515
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.33333333 0.16666667 0.42857143 0.28571429 0.
|
|
0.4 0.33333333 0.33333333 0.4 ]
|
|
|
|
mean value: 0.3180952380952381
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.52
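Note: for the six models reported so far in this section, mean train MCC, mean cross-validated test MCC and blind-test MCC can be lined up to show how far the scores fall outside the training data. The sketch below copies those three numbers per model from the log (means rounded to four decimals) and computes the two gaps. The variable names and the ordering by gap are illustrative choices, not part of the original script.

```python
# (mean train_mcc, mean test_mcc, blind-test MCC), copied from the log.
results = {
    "K-Nearest Neighbors": (0.6107, 0.4142, 0.06),
    "SVM":                 (0.8274, 0.6197, 0.09),
    "MLP":                 (1.0000, 0.5474, 0.09),
    "Decision Tree":       (1.0000, 0.9040, 0.11),
    "Extra Trees":         (1.0000, 0.7123, 0.05),
    "Extra Tree":          (1.0000, 0.0211, 0.09),
}

# Gap between training and cross-validation, and between CV and blind test.
gaps = {
    name: (train - cv, cv - blind)
    for name, (train, cv, blind) in results.items()
}

# Sort so the model whose CV score transfers worst to the blind set comes first.
ranked = sorted(gaps.items(), key=lambda item: item[1][1], reverse=True)

for name, (train_cv_gap, cv_blind_gap) in ranked:
    print(f"{name:<20} train-CV gap {train_cv_gap:6.3f}   CV-blind gap {cv_blind_gap:6.3f}")

worst_name, (_, worst_drop) = ranked[0]
```

On these numbers the Decision Tree shows the largest drop from cross-validation to the blind test (about 0.79), and no model reaches a blind-test MCC above 0.11, so the cross-validated scores should be read with caution.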
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.95179796 0.94567418 0.94950175 0.96822572 0.93672466 0.96356082
|
|
0.97707057 1.03818297 1.03070283 1.00553799]
|
|
|
|
mean value: 0.9766979455947876
|
|
|
|
key: score_time
|
|
value: [0.09188795 0.09431767 0.08775377 0.08800793 0.09016871 0.08714199
|
|
0.09610558 0.09580159 0.09604168 0.09132028]
|
|
|
|
mean value: 0.09185471534729003
|
|
|
|
key: test_mcc
|
|
value: [0.8 0.55 0.8 1. 0.55 1.
|
|
1. 0.77459667 0.77459667 1. ]
|
|
|
|
mean value: 0.8249193338482967
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 0.77777778 0.88888889 1. 0.77777778 1.
|
|
1. 0.875 0.875 1. ]
|
|
|
|
mean value: 0.9083333333333333
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.75 0.88888889 1. 0.8 1.
|
|
1. 0.88888889 0.85714286 1. ]
|
|
|
|
mean value: 0.9073809523809524
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.75 0.8 1. 0.8 1. 1. 0.8 1. 1. ]
|
|
|
|
mean value: 0.895
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 1. 1. 0.8 1. 1. 1. 0.75 1. ]
|
|
|
|
mean value: 0.93
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.775 0.9 1. 0.775 1. 1. 0.875 0.875 1. ]
|
|
|
|
mean value: 0.91
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.6 0.8 1. 0.66666667 1.
|
|
1. 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8416666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
  warn(
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                                        n_estimators=1000, n_jobs=10,
                                        oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.92752123 0.89317346 0.80733633 0.90238857 0.82283401 0.9298923
|
|
0.91012096 0.82151413 0.85942483 0.84982991]
|
|
|
|
mean value: 0.8724035739898681
|
|
|
|
key: score_time
|
|
value: [0.19612741 0.17702603 0.17312717 0.23580909 0.18489385 0.20000648
|
|
0.20895576 0.13845778 0.27115655 0.17170072]
|
|
|
|
mean value: 0.19572608470916747
|
|
|
|
key: test_mcc
|
|
value: [0.35 0.55 0.8 1. 0.55 1.
|
|
1. 0.5 0.77459667 1. ]
|
|
|
|
mean value: 0.7524596669241483
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.97467943 1. ]
|
|
|
|
mean value: 0.9974679434480896
|
|
|
|
key: test_accuracy
|
|
value: [0.66666667 0.77777778 0.88888889 1. 0.77777778 1.
|
|
1. 0.75 0.875 1. ]
|
|
|
|
mean value: 0.8736111111111111
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.98717949 1. ]
|
|
|
|
mean value: 0.9987179487179487
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.75 0.88888889 1. 0.8 1.
|
|
1. 0.75 0.85714286 1. ]
|
|
|
|
mean value: 0.8712698412698413
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.98701299 1. ]
|
|
|
|
mean value: 0.9987012987012986
|
|
|
|
key: test_precision
|
|
value: [0.6 0.75 0.8 1. 0.8 1. 1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.87
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 1. 1. 0.8 1. 1. 0.75 0.75 1. ]
|
|
|
|
mean value: 0.88
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.97435897 1. ]
|
|
|
|
mean value: 0.9974358974358974
|
|
|
|
key: test_roc_auc
|
|
value: [0.675 0.775 0.9 1. 0.775 1. 1. 0.75 0.875 1. ]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.98717949 1. ]
|
|
|
|
mean value: 0.9987179487179487
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.6 0.8 1. 0.66666667 1.
|
|
1. 0.6 0.75 1. ]
|
|
|
|
mean value: 0.7916666666666666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.97435897 1. ]
|
|
|
|
mean value: 0.9974358974358974
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01836395 0.00678182 0.00684881 0.00723457 0.00742078 0.00704598
|
|
0.00751448 0.00682497 0.00756073 0.0067997 ]
|
|
|
|
mean value: 0.00823957920074463
|
|
|
|
key: score_time
|
|
value: [0.01078892 0.00827122 0.00816345 0.00804162 0.00847411 0.00859022
|
|
0.00832844 0.00801682 0.00818849 0.00833344]
|
|
|
|
mean value: 0.008519673347473144
|
|
|
|
key: test_mcc
|
|
value: [-0.31622777 0.63245553 0.63245553 0.15811388 0.31622777 0.35
|
|
0.57735027 0.77459667 -0.25819889 0. ]
|
|
|
|
mean value: 0.2866772995759719
|
|
|
|
key: train_mcc
|
|
value: [0.53279352 0.50745677 0.5064147 0.45639039 0.53591229 0.42943967
|
|
0.51298918 0.46537892 0.59684919 0.41367015]
|
|
|
|
mean value: 0.49572947743384127
|
|
|
|
key: test_accuracy
|
|
value: [0.33333333 0.77777778 0.77777778 0.55555556 0.66666667 0.66666667
|
|
0.75 0.875 0.375 0.5 ]
|
|
|
|
mean value: 0.6277777777777778
|
|
|
|
key: train_accuracy
|
|
value: [0.76623377 0.75324675 0.75324675 0.72727273 0.76623377 0.71428571
|
|
0.75641026 0.73076923 0.79487179 0.70512821]
|
|
|
|
mean value: 0.7467698967698968
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.8 0.8 0.5 0.72727273 0.66666667
|
|
0.66666667 0.85714286 0.44444444 0.5 ]
|
|
|
|
mean value: 0.6362193362193362
|
|
|
|
key: train_fscore
|
|
value: [0.775 0.7654321 0.75949367 0.73417722 0.775 0.71794872
|
|
0.75949367 0.74698795 0.80952381 0.72289157]
|
|
|
|
mean value: 0.7565948701272275
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.66666667 0.66666667 0.66666667 0.66666667 0.75
|
|
1. 1. 0.4 0.5 ]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_precision
|
|
value: [0.75609756 0.73809524 0.75 0.70731707 0.73809524 0.7
|
|
0.75 0.70454545 0.75555556 0.68181818]
|
|
|
|
mean value: 0.7281524302256009
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 0.4 0.8 0.6 0.5 0.75 0.5 0.5 ]
|
|
|
|
mean value: 0.655
|
|
|
|
key: train_recall
|
|
value: [0.79487179 0.79487179 0.76923077 0.76315789 0.81578947 0.73684211
|
|
0.76923077 0.79487179 0.87179487 0.76923077]
|
|
|
|
mean value: 0.7879892037786774
|
|
|
|
key: test_roc_auc
|
|
value: [0.35 0.8 0.8 0.575 0.65 0.675 0.75 0.875 0.375 0.5 ]
|
|
|
|
mean value: 0.635
|
|
|
|
key: train_roc_auc
|
|
value: [0.76585695 0.75269906 0.75303644 0.72773279 0.7668691 0.7145749
|
|
0.75641026 0.73076923 0.79487179 0.70512821]
|
|
|
|
mean value: 0.7467948717948718
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.66666667 0.66666667 0.33333333 0.57142857 0.5
|
|
0.5 0.75 0.28571429 0.33333333]
|
|
|
|
mean value: 0.4857142857142857
|
|
|
|
key: train_jcc
|
|
value: [0.63265306 0.62 0.6122449 0.58 0.63265306 0.56
|
|
0.6122449 0.59615385 0.68 0.56603774]
|
|
|
|
mean value: 0.609198750037025
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
                               interaction_constraints=None, learning_rate=None,
                               max_delta_step=None, max_depth=None,
                               min_child_weight=None, missing=nan,
                               monotone_constraints=None, n_estimators=100,
                               n_jobs=None, num_parallel_tree=None,
                               predictor=None, random_state=42, reg_alpha=None,
                               reg_lambda=None, scale_pos_weight=None,
                               subsample=None, tree_method=None,
                               use_label_encoder=False,
                               validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.0487535 0.02997184 0.05316138 0.02903056 0.02904987 0.0322907
|
|
0.18433547 0.02987671 0.02673578 0.02919006]
|
|
|
|
mean value: 0.049239587783813474
|
|
|
|
key: score_time
|
|
value: [0.01463509 0.00997066 0.00984716 0.00954914 0.00983858 0.01014209
|
|
0.01031256 0.01060939 0.01125598 0.00966144]
|
|
|
|
mean value: 0.010582208633422852
|
|
|
|
key: test_mcc
|
|
value: [0.8 1. 1. 1. 0.8 1.
|
|
1. 0.77459667 0.57735027 1. ]
|
|
|
|
mean value: 0.8951946938431109
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 1. 1. 1. 0.88888889 1.
|
|
1. 0.875 0.75 1. ]
|
|
|
|
mean value: 0.9402777777777778
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 1. 1. 0.88888889 1.
|
|
1. 0.88888889 0.66666667 1. ]
|
|
|
|
mean value: 0.9333333333333333
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 1. 1. 1. 1. 0.8 1. 1. ]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.8 1. 1. 1. 0.5 1. ]
|
|
|
|
mean value: 0.93
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 1. 1. 1. 0.9 1. 1. 0.875 0.75 1. ]
|
|
|
|
mean value: 0.9425
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 1. 1. 0.8 1. 1. 0.8 0.5 1. ]
|
|
|
|
mean value: 0.89
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=0, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.300000012,
              max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=100, n_jobs=12,
              num_parallel_tree=1, predictor='auto', random_state=42,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', use_label_encoder=False,
              validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00983548 0.01052451 0.0106616 0.01091266 0.010957 0.01091719
|
|
0.01095319 0.01184487 0.01091933 0.01090574]
|
|
|
|
mean value: 0.010843157768249512
|
|
|
|
key: score_time
|
|
value: [0.0105567 0.01009798 0.01041722 0.01042056 0.01047301 0.01049972
|
|
0.01047277 0.01058149 0.01053119 0.01043415]
|
|
|
|
mean value: 0.010448479652404785
|
|
|
|
key: test_mcc
|
|
value: [0.8 0.8 0.8 0.79056942 1. 0.55
|
|
1. 0.77459667 0.25819889 0.57735027]
|
|
|
|
mean value: 0.7350715243220365
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 0.97434188 0.97435897 1. 0.97435897
|
|
0.94996791 1. 1. 1. ]
|
|
|
|
mean value: 0.987302773890115
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 0.88888889 0.88888889 0.88888889 1. 0.77777778
|
|
1. 0.875 0.625 0.75 ]
|
|
|
|
mean value: 0.8583333333333333
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 0.98701299 0.98701299 1. 0.98701299
|
|
0.97435897 1. 1. 1. ]
|
|
|
|
mean value: 0.9935397935397935
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 0.88888889 0.90909091 1. 0.8
|
|
1. 0.88888889 0.57142857 0.66666667]
|
|
|
|
mean value: 0.8502741702741703
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 0.98734177 0.98701299 1. 0.98701299
|
|
0.975 1. 1. 1. ]
|
|
|
|
mean value: 0.9936367746177872
|
|
|
|
key: test_precision
|
|
value: [0.8 0.8 0.8 0.83333333 1. 0.8
|
|
1. 0.8 0.66666667 1. ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.975 0.97435897 1. 0.97435897
|
|
0.95121951 1. 1. 1. ]
|
|
|
|
mean value: 0.987493746091307
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.8 1. 1. 0.5 0.5]
|
|
|
|
mean value: 0.88
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.9 0.9 0.875 1. 0.775 1. 0.875 0.625 0.75 ]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 0.98684211 0.98717949 1. 0.98717949
|
|
0.97435897 1. 1. 1. ]
|
|
|
|
mean value: 0.9935560053981106
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 0.8 0.83333333 1. 0.66666667
|
|
1. 0.8 0.4 0.5 ]
|
|
|
|
mean value: 0.76
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 0.975 0.97435897 1. 0.97435897
|
|
0.95121951 1. 1. 1. ]
|
|
|
|
mean value: 0.987493746091307
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00919414 0.00788784 0.00760937 0.00748897 0.00734115 0.00743103
|
|
0.00754428 0.00746417 0.00744605 0.00730228]
|
|
|
|
mean value: 0.007670927047729492
|
|
|
|
key: score_time
|
|
value: [0.01072121 0.0088973 0.00895858 0.00848174 0.00847363 0.00847983
|
|
0.00860381 0.00846124 0.00856113 0.00795674]
|
|
|
|
mean value: 0.008759522438049316
|
|
|
|
key: test_mcc
|
|
value: [0.55 0.1 0.8 0.8 0.31622777 0.55
|
|
0.57735027 0.57735027 0. 0.57735027]
|
|
|
|
mean value: 0.4848278573585716
|
|
|
|
key: train_mcc
|
|
value: [0.61257733 0.66463964 0.6374073 0.55962522 0.63928106 0.58485583
|
|
0.64102564 0.56577895 0.66688593 0.56428809]
|
|
|
|
mean value: 0.6136364988556005
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.55555556 0.88888889 0.88888889 0.66666667 0.77777778
|
|
0.75 0.75 0.5 0.75 ]
|
|
|
|
mean value: 0.7305555555555555
|
|
|
|
key: train_accuracy
|
|
value: [0.80519481 0.83116883 0.81818182 0.77922078 0.81818182 0.79220779
|
|
0.82051282 0.78205128 0.83333333 0.78205128]
|
|
|
|
mean value: 0.8062104562104563
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.5 0.88888889 0.88888889 0.72727273 0.8
|
|
0.66666667 0.66666667 0.5 0.66666667]
|
|
|
|
mean value: 0.7055050505050505
|
|
|
|
key: train_fscore
|
|
value: [0.8 0.82666667 0.81578947 0.76712329 0.80555556 0.78378378
|
|
0.82051282 0.77333333 0.83544304 0.78481013]
|
|
|
|
mean value: 0.8013018085764565
|
|
|
|
key: test_precision
|
|
value: [0.75 0.5 0.8 1. 0.66666667 0.8
|
|
1. 1. 0.5 1. ]
|
|
|
|
mean value: 0.8016666666666666
|
|
|
|
key: train_precision
|
|
value: [0.83333333 0.86111111 0.83783784 0.8 0.85294118 0.80555556
|
|
0.82051282 0.80555556 0.825 0.775 ]
|
|
|
|
mean value: 0.8216847390376802
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 1. 0.8 0.8 0.8 0.5 0.5 0.5 0.5 ]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_recall
|
|
value: [0.76923077 0.79487179 0.79487179 0.73684211 0.76315789 0.76315789
|
|
0.82051282 0.74358974 0.84615385 0.79487179]
|
|
|
|
mean value: 0.7827260458839406
|
|
|
|
key: test_roc_auc
|
|
value: [0.775 0.55 0.9 0.9 0.65 0.775 0.75 0.75 0.5 0.75 ]
|
|
|
|
mean value: 0.73
|
|
|
|
key: train_roc_auc
|
|
value: [0.80566802 0.83164642 0.81848853 0.77867746 0.81747638 0.79183536
|
|
0.82051282 0.78205128 0.83333333 0.78205128]
|
|
|
|
mean value: 0.8061740890688258
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.33333333 0.8 0.8 0.57142857 0.66666667
|
|
0.5 0.5 0.33333333 0.5 ]
|
|
|
|
mean value: 0.5604761904761905
|
|
|
|
key: train_jcc
|
|
value: [0.66666667 0.70454545 0.68888889 0.62222222 0.6744186 0.64444444
|
|
0.69565217 0.63043478 0.7173913 0.64583333]
|
|
|
|
mean value: 0.6690497875621738
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.56
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00780702 0.00762272 0.00795984 0.00789499 0.00773883 0.00785589
0.00779939 0.00809336 0.00800991 0.00795817]

mean value: 0.007874011993408203

key: score_time
value: [0.00883174 0.00864387 0.00875735 0.00865006 0.00847197 0.00864172
0.00876093 0.00859165 0.00871468 0.00850964]

mean value: 0.008657360076904297

key: test_mcc
value: [0.31622777 0.15811388 0.8 0.79056942 1. 1.
0.77459667 0.5 0.5 1. ]

mean value: 0.6839507733308835

key: train_mcc
value: [1. 0.92480439 0.94935876 1. 0.94804318 0.94935876
0.87904907 0.85634884 0.97467943 0.90219371]

mean value: 0.93838361447064

key: test_accuracy
value: [0.66666667 0.55555556 0.88888889 0.88888889 1. 1.
0.875 0.75 0.75 1. ]

mean value: 0.8375

key: train_accuracy
value: [1. 0.96103896 0.97402597 1. 0.97402597 0.97402597
0.93589744 0.92307692 0.98717949 0.94871795]

mean value: 0.9677988677988678

key: test_fscore
value: [0.57142857 0.6 0.88888889 0.90909091 1. 1.
0.88888889 0.75 0.75 1. ]

mean value: 0.8358297258297258

key: train_fscore
value: [1. 0.96296296 0.97368421 1. 0.97368421 0.97435897
0.93975904 0.92857143 0.98734177 0.95121951]

mean value: 0.9691582107437596

key: test_precision
value: [0.66666667 0.5 0.8 0.83333333 1. 1.
0.8 0.75 0.75 1. ]

mean value: 0.81

key: train_precision
value: [1. 0.92857143 1. 1. 0.97368421 0.95
0.88636364 0.86666667 0.975 0.90697674]

mean value: 0.9487262686314094

key: test_recall
value: [0.5 0.75 1. 1. 1. 1. 1. 0.75 0.75 1. ]

mean value: 0.875

key: train_recall
value: [1. 1. 0.94871795 1. 0.97368421 1.
1. 1. 1. 1. ]

mean value: 0.9922402159244265

key: test_roc_auc
value: [0.65 0.575 0.9 0.875 1. 1. 0.875 0.75 0.75 1. ]

mean value: 0.8375

key: train_roc_auc
value: [1. 0.96052632 0.97435897 1. 0.97402159 0.97435897
0.93589744 0.92307692 0.98717949 0.94871795]

mean value: 0.9678137651821862

key: test_jcc
value: [0.4 0.42857143 0.8 0.83333333 1. 1.
0.8 0.6 0.6 1. ]

mean value: 0.7461904761904762

key: train_jcc
value: [1. 0.92857143 0.94871795 1. 0.94871795 0.95
0.88636364 0.86666667 0.975 0.90697674]

mean value: 0.9411014373223675

MCC on Blind test: 0.09

Accuracy on Blind test: 0.53

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00939584 0.00922012 0.00787425 0.00773025 0.00760698 0.00704384
0.00760889 0.00750351 0.00746274 0.00742221]

mean value: 0.007886862754821778

key: score_time
value: [0.01045752 0.01001 0.00872231 0.0086472 0.00853968 0.00859547
0.0085783 0.00867987 0.00851488 0.008075 ]

mean value: 0.0088820219039917

key: test_mcc
value: [0.5976143 0.55 0.8 0.79056942 0.63245553 1.
0.77459667 0.5 0. 1. ]

mean value: 0.6645235920984451

key: train_mcc
value: [0.90109146 1. 0.97435897 0.90109146 0.70243936 0.75611265
0.94996791 1. 0.46770717 0.9258201 ]

mean value: 0.8578589074717703

key: test_accuracy
value: [0.77777778 0.77777778 0.88888889 0.88888889 0.77777778 1.
0.875 0.75 0.5 1. ]

mean value: 0.8236111111111111

key: train_accuracy
value: [0.94805195 1. 0.98701299 0.94805195 0.83116883 0.87012987
0.97435897 1. 0.67948718 0.96153846]

mean value: 0.91998001998002

key: test_fscore
value: [0.66666667 0.75 0.88888889 0.90909091 0.75 1.
0.88888889 0.75 0.6 1. ]

mean value: 0.8203535353535354

key: train_fscore
value: [0.94594595 1. 0.98701299 0.95 0.79365079 0.85294118
0.975 1. 0.75728155 0.96296296]

mean value: 0.9224795419441336

key: test_precision
value: [1. 0.75 0.8 0.83333333 1. 1.
0.8 0.75 0.5 1. ]

mean value: 0.8433333333333334

key: train_precision
value: [1. 1. 1. 0.9047619 1. 0.96666667
0.95121951 1. 0.609375 0.92857143]

mean value: 0.9360594512195122

key: test_recall
value: [0.5 0.75 1. 1. 0.6 1. 1. 0.75 0.75 1. ]

mean value: 0.835

key: train_recall
value: [0.8974359 1. 0.97435897 1. 0.65789474 0.76315789
1. 1. 1. 1. ]

mean value: 0.9292847503373819

key: test_roc_auc
value: [0.75 0.775 0.9 0.875 0.8 1. 0.875 0.75 0.5 1. ]

mean value: 0.8225

key: train_roc_auc
value: [0.94871795 1. 0.98717949 0.94871795 0.82894737 0.86875843
0.97435897 1. 0.67948718 0.96153846]

mean value: 0.9197705802968961

key: test_jcc
value: [0.5 0.6 0.8 0.83333333 0.6 1.
0.8 0.6 0.42857143 1. ]

mean value: 0.7161904761904762

key: train_jcc
value: [0.8974359 1. 0.97435897 0.9047619 0.65789474 0.74358974
0.95121951 1. 0.609375 0.92857143]

mean value: 0.8667207197755176

MCC on Blind test: 0.11

Accuracy on Blind test: 0.65

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.07140326 0.05798793 0.05990863 0.05765224 0.0598948 0.06153941
0.05800366 0.05805969 0.05758524 0.05807018]

mean value: 0.06001050472259521

key: score_time
value: [0.0154562 0.01460052 0.01546645 0.01415467 0.01506758 0.01572824
0.01406693 0.01429629 0.01435161 0.01549411]

mean value: 0.01486825942993164

key: test_mcc
value: [0.8 1. 1. 1. 0.63245553 1.
1. 1. 0.77459667 1. ]

mean value: 0.9207052201275159

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.88888889 1. 1. 1. 0.77777778 1.
1. 1. 0.875 1. ]

mean value: 0.9541666666666666

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.88888889 1. 1. 1. 0.75 1.
1. 1. 0.88888889 1. ]

mean value: 0.9527777777777777

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.8 1. 1. 1. 1. 1. 1. 1. 0.8 1. ]

mean value: 0.96

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.6 1. 1. 1. 1. 1. ]

mean value: 0.96

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.9 1. 1. 1. 0.8 1. 1. 1. 0.875 1. ]

mean value: 0.9575

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.8 1. 1. 1. 0.6 1. 1. 1. 0.8 1. ]

mean value: 0.92

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.1

Accuracy on Blind test: 0.81

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.02970028 0.02384472 0.02287769 0.03229713 0.02531958 0.02179098
0.02337933 0.03095937 0.02274895 0.02188182]

mean value: 0.025479984283447266

key: score_time
value: [0.0158186 0.0168395 0.01607776 0.02253652 0.02111721 0.01882815
0.02102876 0.02306414 0.01536655 0.0155549 ]

mean value: 0.018623208999633788

key: test_mcc
value: [0.8 1. 1. 1. 0.63245553 1.
1. 0.77459667 0.57735027 1. ]

mean value: 0.8784402470464785

key: train_mcc
value: [1. 1. 0.97435897 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9974358974358974

key: test_accuracy
value: [0.88888889 1. 1. 1. 0.77777778 1.
1. 0.875 0.75 1. ]

mean value: 0.9291666666666667

key: train_accuracy
value: [1. 1. 0.98701299 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9987012987012986

key: test_fscore
value: [0.88888889 1. 1. 1. 0.75 1.
1. 0.88888889 0.66666667 1. ]

mean value: 0.9194444444444444

key: train_fscore
value: [1. 1. 0.98701299 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9987012987012986

key: test_precision
value: [0.8 1. 1. 1. 1. 1. 1. 0.8 1. 1. ]

mean value: 0.96

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.6 1. 1. 1. 0.5 1. ]

mean value: 0.91

key: train_recall
value: [1. 1. 0.97435897 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9974358974358974

key: test_roc_auc
value: [0.9 1. 1. 1. 0.8 1. 1. 0.875 0.75 1. ]

mean value: 0.9325

key: train_roc_auc
value: [1. 1. 0.98717949 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9987179487179487

key: test_jcc
value: [0.8 1. 1. 1. 0.6 1. 1. 0.8 0.5 1. ]

mean value: 0.87

key: train_jcc
value: [1. 1. 0.97435897 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9974358974358974

MCC on Blind test: 0.12

Accuracy on Blind test: 0.84

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.01199746 0.01234484 0.01242924 0.01289916 0.01300168 0.01250005
0.01238441 0.01239324 0.01242089 0.01329017]

mean value: 0.012566113471984863

key: score_time
value: [0.0101943 0.01012278 0.01048732 0.01058817 0.0106461 0.01056409
0.01057553 0.01059437 0.01054907 0.01066661]

mean value: 0.010498833656311036

key: test_mcc
value: [0.55 0.35 0.8 0.79056942 0.79056942 0.8
0.77459667 0.5 0.25819889 0.77459667]

mean value: 0.6388531058314317

key: train_mcc
value: [1. 1. 0.97434188 1. 1. 0.97435897
0.97467943 1. 0.97467943 1. ]

mean value: 0.9898059726472239

key: test_accuracy
value: [0.77777778 0.66666667 0.88888889 0.88888889 0.88888889 0.88888889
0.875 0.75 0.625 0.875 ]

mean value: 0.8125

key: train_accuracy
value: [1. 1. 0.98701299 1. 1. 0.98701299
0.98717949 1. 0.98717949 1. ]

mean value: 0.9948384948384948

key: test_fscore
value: [0.75 0.66666667 0.88888889 0.90909091 0.90909091 0.88888889
0.85714286 0.75 0.57142857 0.85714286]

mean value: 0.8048340548340548

key: train_fscore
value: [1. 1. 0.98734177 1. 1. 0.98701299
0.98734177 1. 0.98701299 1. ]

mean value: 0.9948709518329771

key: test_precision
value: [0.75 0.6 0.8 0.83333333 0.83333333 1.
1. 0.75 0.66666667 1. ]

mean value: 0.8233333333333334

key: train_precision
value: [1. 1. 0.975 1. 1. 0.97435897
0.975 1. 1. 1. ]

mean value: 0.9924358974358974

key: test_recall
value: [0.75 0.75 1. 1. 1. 0.8 0.75 0.75 0.5 0.75]

mean value: 0.805

key: train_recall
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.97435897 1. ]

mean value: 0.9974358974358974

key: test_roc_auc
value: [0.775 0.675 0.9 0.875 0.875 0.9 0.875 0.75 0.625 0.875]

mean value: 0.8125

key: train_roc_auc
value: [1. 1. 0.98684211 1. 1. 0.98717949
0.98717949 1. 0.98717949 1. ]

mean value: 0.9948380566801619

key: test_jcc
value: [0.6 0.5 0.8 0.83333333 0.83333333 0.8
0.75 0.6 0.4 0.75 ]

mean value: 0.6866666666666666

key: train_jcc
value: [1. 1. 0.975 1. 1. 0.97435897
0.975 1. 0.97435897 1. ]

mean value: 0.9898717948717949

MCC on Blind test: 0.1

Accuracy on Blind test: 0.59

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.0684216 0.06198835 0.06150103 0.06125021 0.05035377 0.06086302
0.06419754 0.05684948 0.05581045 0.06373596]

mean value: 0.06049714088439941

key: score_time
value: [0.00865507 0.00866818 0.00822926 0.008461 0.00909662 0.00912499
0.00889039 0.00892878 0.0091598 0.00874305]

mean value: 0.008795714378356934

key: test_mcc
value: [0.63245553 1. 1. 1. 0.63245553 1.
1. 1. 0.77459667 1. ]

mean value: 0.9039507733308835

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.77777778 1. 1. 1. 0.77777778 1.
1. 1. 0.875 1. ]

mean value: 0.9430555555555555

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.8 1. 1. 1. 0.75 1.
1. 1. 0.85714286 1. ]

mean value: 0.9407142857142857

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.66666667 1. 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9666666666666667

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.6 1. 1. 1. 0.75 1. ]

mean value: 0.935

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.8 1. 1. 1. 0.8 1. 1. 1. 0.875 1. ]

mean value: 0.9475

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.66666667 1. 1. 1. 0.6 1.
1. 1. 0.75 1. ]

mean value: 0.9016666666666666

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.08

Accuracy on Blind test: 0.76

Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00811529 0.00809407 0.01050186 0.00907612 0.00753951 0.0079627
|
|
0.00725293 0.00733447 0.00738692 0.00783324]
|
|
|
|
mean value: 0.008109712600708007
|
|
|
|
key: score_time
|
|
value: [0.01106501 0.01026535 0.00954747 0.00802541 0.0085423 0.00829649
|
|
0.00797725 0.00805521 0.00838113 0.00803828]
|
|
|
|
mean value: 0.008819389343261718
|
|
|
|
key: test_mcc
|
|
value: [ 0.05976143 -0.31622777 0.31622777 0. 0.47809144 -0.05976143
|
|
0.25819889 0.57735027 0. 0. ]
|
|
|
|
mean value: 0.13136406026705444
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.55555556 0.44444444 0.66666667 0.44444444 0.66666667 0.44444444
|
|
0.625 0.75 0.5 0.5 ]
|
|
|
|
mean value: 0.5597222222222222
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.33333333 0. 0.57142857 0. 0.57142857 0.28571429
|
|
0.57142857 0.66666667 0.33333333 0. ]
|
|
|
|
mean value: 0.33333333333333337
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0. 0.66666667 0. 1. 0.5
|
|
0.66666667 1. 0.5 0. ]
|
|
|
|
mean value: 0.48333333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.25 0. 0.5 0. 0.4 0.2 0.5 0.5 0.25 0. ]
|
|
|
|
mean value: 0.26
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.525 0.4 0.65 0.5 0.7 0.475 0.625 0.75 0.5 0.5 ]
|
|
|
|
mean value: 0.5625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.2 0. 0.4 0. 0.4 0.16666667
|
|
0.4 0.5 0.2 0. ]
|
|
|
|
mean value: 0.22666666666666668
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.03
|
|
|
|
Accuracy on Blind test: 0.51
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01002216 0.0099225 0.00757122 0.00743675 0.00746179 0.00744224
|
|
0.00728893 0.00753307 0.00745034 0.00752497]
|
|
|
|
mean value: 0.007965397834777833
|
|
|
|
key: score_time
|
|
value: [0.01054311 0.00975561 0.008003 0.00801349 0.00793147 0.0078249
|
|
0.00790191 0.00796008 0.00788474 0.00783968]
|
|
|
|
mean value: 0.008365797996520995
|
|
|
|
key: test_mcc
|
|
value: [0.55 0.35 0.8 1. 1. 1.
|
|
0.77459667 0.5 0.5 1. ]
|
|
|
|
mean value: 0.7474596669241483
|
|
|
|
key: train_mcc
|
|
value: [0.97435897 0.94804318 0.89608637 0.92240216 0.94804318 0.92240216
|
|
0.89861829 0.94871795 1. 0.97467943]
|
|
|
|
mean value: 0.9433351706022929
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.66666667 0.88888889 1. 1. 1.
|
|
0.875 0.75 0.75 1. ]
|
|
|
|
mean value: 0.8708333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.98701299 0.97402597 0.94805195 0.96103896 0.97402597 0.96103896
|
|
0.94871795 0.97435897 1. 0.98717949]
|
|
|
|
mean value: 0.9715451215451215
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.66666667 0.88888889 1. 1. 1.
|
|
0.88888889 0.75 0.75 1. ]
|
|
|
|
mean value: 0.8694444444444445
|
|
|
|
key: train_fscore
|
|
value: [0.98701299 0.97435897 0.94871795 0.96103896 0.97368421 0.96103896
|
|
0.95 0.97435897 1. 0.98701299]
|
|
|
|
mean value: 0.971722400406611
|
|
|
|
key: test_precision
|
|
value: [0.75 0.6 0.8 1. 1. 1. 0.8 0.75 0.75 1. ]
|
|
|
|
mean value: 0.845
|
|
|
|
key: train_precision
|
|
value: [1. 0.97435897 0.94871795 0.94871795 0.97368421 0.94871795
|
|
0.92682927 0.97435897 1. 1. ]
|
|
|
|
mean value: 0.9695385273690793
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 1. 1. 1. 1. 1. 0.75 0.75 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:183: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:186: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_recall
|
|
value: [0.97435897 0.97435897 0.94871795 0.97368421 0.97368421 0.97368421
|
|
0.97435897 0.97435897 1. 0.97435897]
|
|
|
|
mean value: 0.9741565452091768
|
|
|
|
key: test_roc_auc
|
|
value: [0.775 0.675 0.9 1. 1. 1. 0.875 0.75 0.75 1. ]
|
|
|
|
mean value: 0.8725
|
|
|
|
key: train_roc_auc
|
|
value: [0.98717949 0.97402159 0.94804318 0.96120108 0.97402159 0.96120108
|
|
0.94871795 0.97435897 1. 0.98717949]
|
|
|
|
mean value: 0.9715924426450743
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.5 0.8 1. 1. 1. 0.8 0.6 0.6 1. ]
|
|
|
|
mean value: 0.79
|
|
|
|
key: train_jcc
|
|
value: [0.97435897 0.95 0.90243902 0.925 0.94871795 0.925
|
|
0.9047619 0.95 1. 0.97435897]
|
|
|
|
mean value: 0.9454636826588046
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.07551575 0.0629611 0.06268406 0.06178474 0.06372309 0.06233025
|
|
0.06263781 0.06301665 0.06373763 0.06232142]
|
|
|
|
mean value: 0.06407124996185302
|
|
|
|
key: score_time
|
|
value: [0.00872803 0.00875998 0.00883865 0.0087626 0.00883889 0.00868249
|
|
0.00857091 0.00899076 0.00884104 0.00882053]
|
|
|
|
mean value: 0.008783388137817382
|
|
|
|
key: test_mcc
|
|
value: [0.8 0.35 0.8 1. 1. 0.8
|
|
1. 0.77459667 0.5 0.77459667]
|
|
|
|
mean value: 0.7799193338482967
|
|
|
|
key: train_mcc
|
|
value: [0.94804318 0.94804318 0.94804318 0.92240216 0.94804318 0.94804318
|
|
0.94871795 1. 1. 0.94871795]
|
|
|
|
mean value: 0.9560053981106613
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 0.66666667 0.88888889 1. 1. 0.88888889
|
|
1. 0.875 0.75 0.875 ]
|
|
|
|
mean value: 0.8833333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.97402597 0.97402597 0.97402597 0.96103896 0.97402597 0.97402597
|
|
0.97435897 1. 1. 0.97435897]
|
|
|
|
mean value: 0.977988677988678
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.66666667 0.88888889 1. 1. 0.88888889
|
|
1. 0.88888889 0.75 0.85714286]
|
|
|
|
mean value: 0.8829365079365079
|
|
|
|
key: train_fscore
|
|
value: [0.97435897 0.97435897 0.97435897 0.96103896 0.97368421 0.97368421
|
|
0.97435897 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9780202253886464
|
|
|
|
key: test_precision
|
|
value: [0.8 0.6 0.8 1. 1. 1. 1. 0.8 0.75 1. ]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_precision
|
|
value: [0.97435897 0.97435897 0.97435897 0.94871795 0.97368421 0.97368421
|
|
0.97435897 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9767881241565453
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 1. 1. 1. 0.8 1. 1. 0.75 0.75]
|
|
|
|
mean value: 0.905
|
|
|
|
key: train_recall
|
|
value: [0.97435897 0.97435897 0.97435897 0.97368421 0.97368421 0.97368421
|
|
0.97435897 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9792847503373819
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.675 0.9 1. 1. 0.9 1. 0.875 0.75 0.875]
|
|
|
|
mean value: 0.8875
|
|
|
|
key: train_roc_auc
|
|
value: [0.97402159 0.97402159 0.97402159 0.96120108 0.97402159 0.97402159
|
|
0.97435897 1. 1. 0.97435897]
|
|
|
|
mean value: 0.9780026990553307
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.5 0.8 1. 1. 0.8 1. 0.8 0.6 0.75]
|
|
|
|
mean value: 0.805
|
|
|
|
key: train_jcc
|
|
value: [0.95 0.95 0.95 0.925 0.94871795 0.94871795
|
|
0.95 1. 1. 0.95 ]
|
|
|
|
mean value: 0.9572435897435897
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0179987 0.01539111 0.01446462 0.01311445 0.01305366 0.01297665
|
|
0.01468205 0.01300359 0.01401901 0.01403546]
|
|
|
|
mean value: 0.014273929595947265
|
|
|
|
key: score_time
|
|
value: [0.01068687 0.00844073 0.00901723 0.00842404 0.00852823 0.00848007
|
|
0.00915885 0.00892854 0.00850725 0.00879216]
|
|
|
|
mean value: 0.008896398544311523
|
|
|
|
key: test_mcc
|
|
value: [0.51639778 0.62994079 0.73214286 0.49099025 0.87287156 0.87287156
|
|
0.46428571 0.32732684 0.64465837 0.875 ]
|
|
|
|
mean value: 0.6426485720764821
|
|
|
|
key: train_mcc
|
|
value: [0.808911 0.79446135 0.78111679 0.82629176 0.83951407 0.76668815
|
|
0.81031543 0.8251228 0.81092683 0.81027501]
|
|
|
|
mean value: 0.8073623185403057
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.8125 0.86666667 0.73333333 0.93333333 0.93333333
|
|
0.73333333 0.66666667 0.8 0.93333333]
|
|
|
|
mean value: 0.81625
|
|
|
|
key: train_accuracy
|
|
value: [0.90441176 0.89705882 0.89051095 0.91240876 0.91970803 0.88321168
|
|
0.90510949 0.91240876 0.90510949 0.90510949]
|
|
|
|
mean value: 0.903504723057106
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.8 0.85714286 0.75 0.92307692 0.92307692
|
|
0.75 0.70588235 0.84210526 0.93333333]
|
|
|
|
mean value: 0.8262395430506886
|
|
|
|
key: train_fscore
|
|
value: [0.9037037 0.89552239 0.89051095 0.91044776 0.91970803 0.88571429
|
|
0.90510949 0.91044776 0.90225564 0.9037037 ]
|
|
|
|
mean value: 0.9027123709820484
|
|
|
|
key: test_precision
|
|
value: [0.7 0.85714286 0.85714286 0.66666667 1. 1.
|
|
0.75 0.66666667 0.72727273 1. ]
|
|
|
|
mean value: 0.8224891774891775
|
|
|
|
key: train_precision
|
|
value: [0.91044776 0.90909091 0.89705882 0.93846154 0.92647059 0.87323944
|
|
0.89855072 0.92424242 0.92307692 0.91044776]
|
|
|
|
mean value: 0.911108689028196
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.89705882 0.88235294 0.88405797 0.88405797 0.91304348 0.89855072
|
|
0.91176471 0.89705882 0.88235294 0.89705882]
|
|
|
|
mean value: 0.8947357203751065
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.8125 0.86607143 0.74107143 0.92857143 0.92857143
|
|
0.73214286 0.66071429 0.78571429 0.9375 ]
|
|
|
|
mean value: 0.8142857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.90441176 0.89705882 0.8905584 0.91261722 0.91975703 0.88309889
|
|
0.90515772 0.91229753 0.90494459 0.90505115]
|
|
|
|
mean value: 0.9034953111679455
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.66666667 0.75 0.6 0.85714286 0.85714286
|
|
0.6 0.54545455 0.72727273 0.875 ]
|
|
|
|
mean value: 0.711504329004329
|
|
|
|
key: train_jcc
|
|
value: [0.82432432 0.81081081 0.80263158 0.83561644 0.85135135 0.79487179
|
|
0.82666667 0.83561644 0.82191781 0.82432432]
|
|
|
|
mean value: 0.8228131536228147
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                                           n_estimators=1000, n_jobs=10, oob_score=True,
                                           random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
                           colsample_bynode=None, colsample_bytree=None,
                           enable_categorical=False, gamma=None, gpu_id=None,
                           importance_type=None, interaction_constraints=None,
                           learning_rate=None, max_delta_step=None, max_depth=None,
                           min_child_weight=None, missing=nan, monotone_constraints=None,
                           n_estimators=100, n_jobs=None, num_parallel_tree=None,
                           predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
                           scale_pos_weight=None, subsample=None, tree_method=None,
                           use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
       'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
|
|
value: [0.36799169 0.37216139 0.3838408 0.37895703 0.3886342 0.3847723
|
|
0.39137197 0.39352298 0.38147902 0.38831353]
|
|
|
|
mean value: 0.3831044912338257
|
|
|
|
key: score_time
|
|
value: [0.00858855 0.00922418 0.00917625 0.00932384 0.00937819 0.00943565
|
|
0.00947142 0.00946307 0.00953293 0.00940537]
|
|
|
|
mean value: 0.009299945831298829
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 0.8819171 0.875 0.49099025 1. 0.73214286
|
|
0.6000992 0.87287156 0.64465837 0.875 ]
|
|
|
|
mean value: 0.7602620132524002
|
|
|
|
key: train_mcc
|
|
value: [0.85294118 1. 1. 0.88360693 1. 1.
|
|
1. 1. 0.88355744 1. ]
|
|
|
|
mean value: 0.9620105545903546
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.9375 0.93333333 0.73333333 1. 0.86666667
|
|
0.8 0.93333333 0.8 0.93333333]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_accuracy
|
|
value: [0.92647059 1. 1. 0.94160584 1. 1.
|
|
1. 1. 0.94160584 1. ]
|
|
|
|
mean value: 0.9809682267067411
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.94117647 0.93333333 0.75 1. 0.85714286
|
|
0.82352941 0.94117647 0.84210526 0.93333333]
|
|
|
|
mean value: 0.8845326551673302
|
|
|
|
key: train_fscore
|
|
value: [0.92647059 1. 1. 0.94117647 1. 1.
|
|
1. 1. 0.94029851 1. ]
|
|
|
|
mean value: 0.9807945566286216
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.88888889 0.875 0.66666667 1. 0.85714286
|
|
0.77777778 0.88888889 0.72727273 1. ]
|
|
|
|
mean value: 0.8459415584415584
|
|
|
|
key: train_precision
|
|
value: [0.92647059 1. 1. 0.95522388 1. 1.
|
|
1. 1. 0.95454545 1. ]
|
|
|
|
mean value: 0.9836239923377763
|
|
|
|
key: test_recall
|
|
value: [0.875 1. 1. 0.85714286 1. 0.85714286
|
|
0.875 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9339285714285714
|
|
|
|
key: train_recall
|
|
value: [0.92647059 1. 1. 0.92753623 1. 1.
|
|
1. 1. 0.92647059 1. ]
|
|
|
|
mean value: 0.9780477408354646
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.9375 0.9375 0.74107143 1. 0.86607143
|
|
0.79464286 0.92857143 0.78571429 0.9375 ]
|
|
|
|
mean value: 0.8741071428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.92647059 1. 1. 0.94170929 1. 1.
|
|
1. 1. 0.94149616 1. ]
|
|
|
|
mean value: 0.9809676044330776
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.88888889 0.875 0.6 1. 0.75
|
|
0.7 0.88888889 0.72727273 0.875 ]
|
|
|
|
mean value: 0.8005050505050505
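Note on reading test_jcc against test_fscore: assuming "jcc" is the Jaccard index and "fscore" is the F1 score for the same positive class (the log does not say so explicitly), the two are tied by J = F / (2 - F). The sketch below checks that relation against the per-fold LogisticRegressionCV values printed above; the variable names are illustrative only.

```python
# Per-fold values copied from the LogisticRegressionCV block of this log.
test_fscore = [0.82352941, 0.94117647, 0.93333333, 0.75, 1.0,
               0.85714286, 0.82352941, 0.94117647, 0.84210526, 0.93333333]
test_jcc = [0.7, 0.88888889, 0.875, 0.6, 1.0,
            0.75, 0.7, 0.88888889, 0.72727273, 0.875]

def jaccard_from_f1(f1):
    """Convert an F1 score to the Jaccard index for the same class."""
    return f1 / (2.0 - f1)

# Largest absolute gap between the logged Jaccard value and the one
# derived from the logged F1 value, across the ten folds.
max_gap = max(abs(jaccard_from_f1(f) - j) for f, j in zip(test_fscore, test_jcc))
print(f"max |J(F1) - jcc| over folds: {max_gap:.2e}")
```

If the assumption holds, the gap is only rounding noise, so the jcc rows carry no information beyond the fscore rows.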
|
|
|
|
key: train_jcc
|
|
value: [0.8630137 1. 1. 0.88888889 1. 1.
|
|
1. 1. 0.88732394 1. ]
|
|
|
|
mean value: 0.9639226531180998
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.68
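Note on "MCC on Blind test: 0.0" alongside an accuracy of 0.68: one way this can happen is a classifier that predicts the majority class for every blind-test sample. The log does not print the confusion matrix, so the counts below are hypothetical and only illustrate the arithmetic; they are not taken from this run.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # When a row or column of the confusion matrix is empty the ratio is
    # undefined; 0.0 is the usual convention (scikit-learn does the same).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# HYPOTHETICAL counts: 100 samples, 68 negative and 32 positive,
# with every sample predicted negative.
tp, fn = 0, 32
tn, fp = 68, 0

blind_acc = accuracy(tp, tn, fp, fn)
blind_mcc = mcc(tp, tn, fp, fn)
print(f"accuracy = {blind_acc:.2f}, MCC = {blind_mcc:.2f}")
```

Under that assumption the accuracy simply equals the majority-class share, which is why MCC is the more informative of the two blind-test numbers.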
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00949454 0.00907326 0.00779319 0.00751305 0.00748301 0.0074904
|
|
0.00753403 0.00772738 0.00752568 0.00749397]
|
|
|
|
mean value: 0.007912850379943848
|
|
|
|
key: score_time
|
|
value: [0.01055598 0.01037431 0.00883818 0.00855374 0.00868988 0.00856733
|
|
0.00868702 0.00864244 0.00872922 0.00876641]
|
|
|
|
mean value: 0.009040451049804688
|
|
|
|
key: test_mcc
|
|
value: [0.37796447 0.25 0.60714286 0.26189246 0.46428571 0.56407607
|
|
0.19642857 0.41931393 0.21821789 0.34247476]
|
|
|
|
mean value: 0.3701796738627931
|
|
|
|
key: train_mcc
|
|
value: [0.59233863 0.52313884 0.49254979 0.53036644 0.56781069 0.53654458
|
|
0.71021843 0.58848522 0.56432157 0.58903512]
|
|
|
|
mean value: 0.5694809310571065
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.625 0.8 0.6 0.73333333 0.73333333
|
|
0.6 0.66666667 0.6 0.66666667]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_accuracy
|
|
value: [0.78676471 0.75 0.72992701 0.75912409 0.76642336 0.75182482
|
|
0.84671533 0.7810219 0.77372263 0.77372263]
|
|
|
|
mean value: 0.7719246457707171
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.625 0.8 0.66666667 0.71428571 0.77777778
|
|
0.625 0.76190476 0.57142857 0.73684211]
|
|
|
|
mean value: 0.7006178324599377
|
|
|
|
key: train_fscore
|
|
value: [0.81045752 0.78205128 0.77300613 0.78431373 0.80246914 0.79012346
|
|
0.82644628 0.80769231 0.7394958 0.80745342]
|
|
|
|
mean value: 0.7923509054595705
|
|
|
|
key: test_precision
|
|
value: [0.57142857 0.625 0.75 0.54545455 0.71428571 0.63636364
|
|
0.625 0.61538462 0.66666667 0.63636364]
|
|
|
|
mean value: 0.6385947385947386
|
|
|
|
key: train_precision
|
|
value: [0.72941176 0.69318182 0.67021277 0.71428571 0.69892473 0.68817204
|
|
0.94339623 0.71590909 0.8627451 0.69892473]
|
|
|
|
mean value: 0.7415163983870607
|
|
|
|
key: test_recall
|
|
value: [1. 0.625 0.85714286 0.85714286 0.71428571 1.
|
|
0.625 1. 0.5 0.875 ]
|
|
|
|
mean value: 0.8053571428571429
|
|
|
|
key: train_recall
|
|
value: [0.91176471 0.89705882 0.91304348 0.86956522 0.94202899 0.92753623
|
|
0.73529412 0.92647059 0.64705882 0.95588235]
|
|
|
|
mean value: 0.8725703324808184
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.625 0.80357143 0.61607143 0.73214286 0.75
|
|
0.59821429 0.64285714 0.60714286 0.65178571]
|
|
|
|
mean value: 0.6651785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.78676471 0.75 0.72858056 0.75831202 0.76513214 0.75053282
|
|
0.84590793 0.78207587 0.77280477 0.77504263]
|
|
|
|
mean value: 0.7715153452685422
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.45454545 0.66666667 0.5 0.55555556 0.63636364
|
|
0.45454545 0.61538462 0.4 0.58333333]
|
|
|
|
mean value: 0.5437823287823288
|
|
|
|
key: train_jcc
|
|
value: [0.68131868 0.64210526 0.63 0.64516129 0.67010309 0.65306122
|
|
0.70422535 0.67741935 0.58666667 0.67708333]
|
|
|
|
mean value: 0.6567144259023844
|
|
|
|
MCC on Blind test: 0.02
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00795698 0.00770211 0.00773144 0.00778985 0.00778174 0.00784087
|
|
0.00769806 0.0066731 0.00672388 0.0066855 ]
|
|
|
|
mean value: 0.007458353042602539
|
|
|
|
key: score_time
|
|
value: [0.00865579 0.00877738 0.00871015 0.00862479 0.00876021 0.00872946
|
|
0.00876045 0.00781512 0.00778174 0.00782132]
|
|
|
|
mean value: 0.008443641662597656
|
|
|
|
key: test_mcc
|
|
value: [ 0.25 -0.25 0.73214286 0.09449112 0.75592895 0.49099025
|
|
0.33928571 -0.13363062 0.33928571 0.19642857]
|
|
|
|
mean value: 0.2814922553488389
|
|
|
|
key: train_mcc
|
|
value: [0.50195781 0.54894692 0.44946013 0.47724794 0.37278745 0.44522592
|
|
0.41602728 0.48933032 0.41632915 0.44553401]
|
|
|
|
mean value: 0.4562846929723249
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.375 0.86666667 0.53333333 0.86666667 0.73333333
|
|
0.66666667 0.46666667 0.66666667 0.6 ]
|
|
|
|
mean value: 0.64
|
|
|
|
key: train_accuracy
|
|
value: [0.75 0.77205882 0.72262774 0.73722628 0.68613139 0.72262774
|
|
0.7080292 0.74452555 0.7080292 0.72262774]
|
|
|
|
mean value: 0.727388364104766
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.375 0.85714286 0.58823529 0.83333333 0.75
|
|
0.66666667 0.6 0.66666667 0.625 ]
|
|
|
|
mean value: 0.658704481792717
|
|
|
|
key: train_fscore
|
|
value: [0.76056338 0.7862069 0.74324324 0.75342466 0.68148148 0.72463768
|
|
0.70588235 0.73684211 0.71014493 0.72463768]
|
|
|
|
mean value: 0.7327064407151792
|
|
|
|
key: test_precision
|
|
value: [0.625 0.375 0.85714286 0.5 1. 0.66666667
|
|
0.71428571 0.5 0.71428571 0.625 ]
|
|
|
|
mean value: 0.6577380952380952
|
|
|
|
key: train_precision
|
|
value: [0.72972973 0.74025974 0.69620253 0.71428571 0.6969697 0.72463768
|
|
0.70588235 0.75384615 0.7 0.71428571]
|
|
|
|
mean value: 0.7176099315122916
|
|
|
|
key: test_recall
|
|
value: [0.625 0.375 0.85714286 0.71428571 0.71428571 0.85714286
|
|
0.625 0.75 0.625 0.625 ]
|
|
|
|
mean value: 0.6767857142857143
|
|
|
|
key: train_recall
|
|
value: [0.79411765 0.83823529 0.79710145 0.79710145 0.66666667 0.72463768
|
|
0.70588235 0.72058824 0.72058824 0.73529412]
|
|
|
|
mean value: 0.7500213128729752
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.375 0.86607143 0.54464286 0.85714286 0.74107143
|
|
0.66964286 0.44642857 0.66964286 0.59821429]
|
|
|
|
mean value: 0.6392857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.75 0.77205882 0.72208014 0.73678602 0.68627451 0.72261296
|
|
0.70801364 0.74435209 0.7081202 0.72271952]
|
|
|
|
mean value: 0.7273017902813299
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.23076923 0.75 0.41666667 0.71428571 0.6
|
|
0.5 0.42857143 0.5 0.45454545]
|
|
|
|
mean value: 0.504938394938395
|
|
|
|
key: train_jcc
|
|
value: [0.61363636 0.64772727 0.59139785 0.6043956 0.51685393 0.56818182
|
|
0.54545455 0.58333333 0.5505618 0.56818182]
|
|
|
|
mean value: 0.57897243357102
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00700665 0.00714684 0.00702357 0.007092 0.00712657 0.0071609
|
|
0.00703955 0.00731921 0.00701737 0.00706315]
|
|
|
|
mean value: 0.007099580764770508
|
|
|
|
key: score_time
|
|
value: [0.00979042 0.00942588 0.00939441 0.00933671 0.00949192 0.00942111
|
|
0.009372 0.00936747 0.00985074 0.00934005]
|
|
|
|
mean value: 0.009479069709777832
|
|
|
|
key: test_mcc
|
|
value: [ 0.51639778 0.25819889 0.73214286 0.21821789 0.75592895 0.32732684
|
|
-0.02620712 0.32732684 0.73214286 0.60714286]
|
|
|
|
mean value: 0.44486186267144306
|
|
|
|
key: train_mcc
|
|
value: [0.72254413 0.69486799 0.68583647 0.72439971 0.62437433 0.68322489
|
|
0.68163703 0.68163703 0.68011153 0.65087548]
|
|
|
|
mean value: 0.6829508591825769
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.625 0.86666667 0.6 0.86666667 0.66666667
|
|
0.46666667 0.66666667 0.86666667 0.8 ]
|
|
|
|
mean value: 0.7175
|
|
|
|
key: train_accuracy
|
|
value: [0.86029412 0.84558824 0.83941606 0.86131387 0.81021898 0.83941606
|
|
0.83941606 0.83941606 0.83941606 0.82481752]
|
|
|
|
mean value: 0.8399313009875483
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.66666667 0.85714286 0.625 0.83333333 0.61538462
|
|
0.2 0.70588235 0.875 0.8 ]
|
|
|
|
mean value: 0.6956187603246426
|
|
|
|
key: train_fscore
|
|
value: [0.86524823 0.85314685 0.85135135 0.86713287 0.82191781 0.84931507
|
|
0.84507042 0.84507042 0.84285714 0.82857143]
|
|
|
|
mean value: 0.8469681591792749
|
|
|
|
key: test_precision
|
|
value: [0.7 0.6 0.85714286 0.55555556 1. 0.66666667
|
|
0.5 0.66666667 0.875 0.85714286]
|
|
|
|
mean value: 0.7278174603174603
|
|
|
|
key: train_precision
|
|
value: [0.83561644 0.81333333 0.79746835 0.83783784 0.77922078 0.80519481
|
|
0.81081081 0.81081081 0.81944444 0.80555556]
|
|
|
|
mean value: 0.8115293169994922
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 0.71428571 0.71428571 0.57142857
|
|
0.125 0.75 0.875 0.75 ]
|
|
|
|
mean value: 0.6982142857142857
|
|
|
|
key: train_recall
|
|
value: [0.89705882 0.89705882 0.91304348 0.89855072 0.86956522 0.89855072
|
|
0.88235294 0.88235294 0.86764706 0.85294118]
|
|
|
|
mean value: 0.8859121909633418
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.625 0.86607143 0.60714286 0.85714286 0.66071429
|
|
0.49107143 0.66071429 0.86607143 0.80357143]
|
|
|
|
mean value: 0.71875
|
|
|
|
key: train_roc_auc
|
|
value: [0.86029412 0.84558824 0.83887468 0.86104007 0.80978261 0.83898124
|
|
0.8397272 0.8397272 0.83962063 0.82502131]
|
|
|
|
mean value: 0.8398657289002557
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.5 0.75 0.45454545 0.71428571 0.44444444
|
|
0.11111111 0.54545455 0.77777778 0.66666667]
|
|
|
|
mean value: 0.560064935064935
|
|
|
|
key: train_jcc
|
|
value: [0.7625 0.74390244 0.74117647 0.7654321 0.69767442 0.73809524
|
|
0.73170732 0.73170732 0.72839506 0.70731707]
|
|
|
|
mean value: 0.7347907434123415
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00901961 0.00860167 0.0086236 0.00857639 0.00859451 0.00781941
|
|
0.00762987 0.00763583 0.00852704 0.00775576]
|
|
|
|
mean value: 0.008278369903564453
|
|
|
|
key: score_time
|
|
value: [0.00887251 0.00862575 0.00865078 0.00868511 0.00861216 0.00795341
|
|
0.00790024 0.00792432 0.00796032 0.00794959]
|
|
|
|
mean value: 0.008313417434692383
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 0.62994079 0.73214286 0.56407607 0.87287156 0.60714286
|
|
0.33928571 0.18898224 0.75592895 0.875 ]
|
|
|
|
mean value: 0.6195311823553656
|
|
|
|
key: train_mcc
|
|
value: [0.77949606 0.85331034 0.85540562 0.86948194 0.82629176 0.86939892
|
|
0.8978896 0.83947987 0.85400682 0.86868474]
|
|
|
|
mean value: 0.8513445663864698
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.8125 0.86666667 0.73333333 0.93333333 0.8
|
|
0.66666667 0.6 0.86666667 0.93333333]
|
|
|
|
mean value: 0.8025
|
|
|
|
key: train_accuracy
|
|
value: [0.88970588 0.92647059 0.9270073 0.93430657 0.91240876 0.93430657
|
|
0.94890511 0.91970803 0.9270073 0.93430657]
|
|
|
|
mean value: 0.9254132674967797
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.8 0.85714286 0.77777778 0.92307692 0.8
|
|
0.66666667 0.66666667 0.88888889 0.93333333]
|
|
|
|
mean value: 0.8137082525317819
|
|
|
|
key: train_fscore
|
|
value: [0.88888889 0.92753623 0.92957746 0.93333333 0.91044776 0.93617021
|
|
0.94814815 0.91851852 0.92647059 0.93333333]
|
|
|
|
mean value: 0.9252424481090294
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.85714286 0.85714286 0.63636364 1. 0.75
|
|
0.71428571 0.6 0.8 1. ]
|
|
|
|
mean value: 0.7992712842712842
|
|
|
|
key: train_precision
|
|
value: [0.89552239 0.91428571 0.90410959 0.95454545 0.93846154 0.91666667
|
|
0.95522388 0.92537313 0.92647059 0.94029851]
|
|
|
|
mean value: 0.9270957461683526
|
|
|
|
key: test_recall
|
|
value: [0.875 0.75 0.85714286 1. 0.85714286 0.85714286
|
|
0.625 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8446428571428571
|
|
|
|
key: train_recall
|
|
value: [0.88235294 0.94117647 0.95652174 0.91304348 0.88405797 0.95652174
|
|
0.94117647 0.91176471 0.92647059 0.92647059]
|
|
|
|
mean value: 0.9239556692242115
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.8125 0.86607143 0.75 0.92857143 0.80357143
|
|
0.66964286 0.58928571 0.85714286 0.9375 ]
|
|
|
|
mean value: 0.8026785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.88970588 0.92647059 0.92679028 0.93446292 0.91261722 0.93414322
|
|
0.9488491 0.91965047 0.92700341 0.93424979]
|
|
|
|
mean value: 0.9253942881500427
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.66666667 0.75 0.63636364 0.85714286 0.66666667
|
|
0.5 0.5 0.8 0.875 ]
|
|
|
|
mean value: 0.6951839826839826
|
|
|
|
key: train_jcc
|
|
value: [0.8 0.86486486 0.86842105 0.875 0.83561644 0.88
|
|
0.90140845 0.84931507 0.8630137 0.875 ]
|
|
|
|
mean value: 0.8612639573680121
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.47060013 0.6176157 0.50893569 0.47440553 0.48640704 0.68746996
|
|
0.48285437 0.49517989 0.48710799 0.6235292 ]
|
|
|
|
mean value: 0.5334105491638184
|
|
|
|
key: score_time
|
|
value: [0.01105475 0.01343441 0.01317406 0.01111579 0.01340437 0.01400685
|
|
0.01163292 0.01111388 0.01380134 0.01445436]
|
|
|
|
mean value: 0.012719273567199707
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.75 0.87287156 0.49099025 1. 0.73214286
|
|
0.47245559 0.32732684 0.75592895 0.73214286]
|
|
|
|
mean value: 0.6908455570136127
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.93333333 0.73333333 1. 0.86666667
|
|
0.73333333 0.66666667 0.86666667 0.86666667]
|
|
|
|
mean value: 0.8416666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.875 0.92307692 0.75 1. 0.85714286
|
|
0.77777778 0.70588235 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8541657688716512
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.875 1. 0.66666667 1. 0.85714286
|
|
0.7 0.66666667 0.8 0.875 ]
|
|
|
|
mean value: 0.824047619047619
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 0.85714286 1. 0.85714286
|
|
0.875 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8946428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.875 0.92857143 0.74107143 1. 0.86607143
|
|
0.72321429 0.66071429 0.85714286 0.86607143]
|
|
|
|
mean value: 0.8392857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.77777778 0.85714286 0.6 1. 0.75
|
|
0.63636364 0.54545455 0.8 0.77777778]
|
|
|
|
mean value: 0.7544516594516595
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.69
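
Note on the blind-test figures above: an accuracy of 0.69 alongside an MCC of 0.06 is the pattern one would expect if the blind set is class-imbalanced, although this part of the log does not show the blind-set class counts. The sketch below uses made-up confusion-matrix counts (not taken from this run) to illustrate how accuracy can look respectable while MCC stays close to zero.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced blind set: 10 positives, 90 negatives.
# These counts are illustrative assumptions, not values from the log.
tp, fn = 2, 8    # most true positives are missed
fp, tn = 12, 78  # most negatives are classified correctly

acc = accuracy(tp, fp, fn, tn)
m = mcc(tp, fp, fn, tn)
print(f"accuracy = {acc:.2f}, MCC = {m:.2f}")
```

Because the majority class dominates the accuracy, MCC is usually the more informative of the two blind-test numbers when the classes are unbalanced.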
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01054811 0.01029015 0.00749731 0.00756335 0.00791764 0.00744152
|
|
0.0080564 0.0079093 0.00725198 0.00789857]
|
|
|
|
mean value: 0.008237433433532716
|
|
|
|
key: score_time
|
|
value: [0.01101589 0.00920248 0.00816894 0.00859499 0.00827861 0.00804591
|
|
0.00809813 0.00824928 0.00804496 0.00812697]
|
|
|
|
mean value: 0.008582615852355957
|
|
|
|
key: test_mcc
|
|
value: [1. 0.77459667 0.875 0.76376262 1. 0.87287156
|
|
1. 1. 0.87287156 0.875 ]
|
|
|
|
mean value: 0.9034102406955395
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.875 0.93333333 0.86666667 1. 0.93333333
|
|
1. 1. 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9475
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.88888889 0.93333333 0.875 1. 0.92307692
|
|
1. 1. 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9494808949220714
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.875 0.77777778 1. 1.
|
|
1. 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9341666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.875 0.9375 0.875 1. 0.92857143
|
|
1. 1. 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.8 0.875 0.77777778 1. 0.85714286
|
|
1. 1. 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9073809523809524
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.85
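
Note on the test_fscore and test_jcc rows above: for a binary problem both scores are built from the same TP, FP and FN counts, so the Jaccard score follows from F1 as J = F1 / (2 - F1). The sketch below checks that identity against the Decision Tree per-fold values printed above. The helper names are hypothetical, and this is an illustrative consistency check, not part of the original pipeline.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score for the positive class from confusion-matrix counts."""
    return 2 * tp / (2 * tp + fp + fn)


def jaccard_from_counts(tp, fp, fn):
    """Jaccard score for the positive class from the same counts."""
    return tp / (tp + fp + fn)


def jaccard_from_f1(f1):
    """Convert a binary F1 score to the matching Jaccard score."""
    return f1 / (2 - f1)


# Per-fold values copied from the Decision Tree block of the log.
test_fscore = [1.0, 0.88888889, 0.93333333, 0.875, 1.0,
               0.92307692, 1.0, 1.0, 0.94117647, 0.93333333]
test_jcc = [1.0, 0.8, 0.875, 0.77777778, 1.0,
            0.85714286, 1.0, 1.0, 0.88888889, 0.875]

# Largest disagreement between the logged Jaccard and the one derived from F1.
max_gap = max(abs(jaccard_from_f1(f) - j)
              for f, j in zip(test_fscore, test_jcc))
print(f"largest |J - F1/(2 - F1)| over the 10 folds: {max_gap:.2e}")
```

Since one score is a monotonic function of the other, the two rows rank folds and models identically; reporting both adds no independent information.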
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08002043 0.08039927 0.08053231 0.08475113 0.0849824 0.07976866
|
|
0.07944989 0.08087707 0.07947731 0.0836072 ]
|
|
|
|
mean value: 0.08138656616210938
|
|
|
|
key: score_time
|
|
value: [0.01744008 0.01705742 0.01676226 0.01815081 0.01665783 0.01678061
|
|
0.01786375 0.0182426 0.01667714 0.01768732]
|
|
|
|
mean value: 0.017331981658935548
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.75 0.87287156 0.66143783 1. 0.87287156
|
|
0.46428571 0.76376262 0.875 0.76376262]
|
|
|
|
mean value: 0.7905908999279945
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.93333333 0.8 1. 0.93333333
|
|
0.73333333 0.86666667 0.93333333 0.86666667]
|
|
|
|
mean value: 0.8879166666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.875 0.92307692 0.82352941 1. 0.92307692
|
|
0.75 0.85714286 0.93333333 0.85714286]
|
|
|
|
mean value: 0.8883478776125835
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.875 1. 0.7 1. 1.
|
|
0.75 1. 1. 1. ]
|
|
|
|
mean value: 0.9213888888888889
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 1. 1. 0.85714286
|
|
0.75 0.75 0.875 0.75 ]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.875 0.92857143 0.8125 1. 0.92857143
|
|
0.73214286 0.875 0.9375 0.875 ]
|
|
|
|
mean value: 0.8901785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.77777778 0.85714286 0.7 1. 0.85714286
|
|
0.6 0.75 0.875 0.75 ]
|
|
|
|
mean value: 0.805595238095238
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00745249 0.0100553 0.00690699 0.00686359 0.00682211 0.0068984
|
|
0.00738096 0.00688457 0.00711179 0.00706244]
|
|
|
|
mean value: 0.007343864440917969
|
|
|
|
key: score_time
|
|
value: [0.00842643 0.00823832 0.00787807 0.00817585 0.00788713 0.00785685
|
|
0.00784945 0.00778127 0.00796604 0.00789261]
|
|
|
|
mean value: 0.007995200157165528
|
|
|
|
key: test_mcc
|
|
value: [1. 0.40451992 0.60714286 0.875 0.76376262 0.33928571
|
|
0.76376262 0.46428571 0.75592895 0.875 ]
|
|
|
|
mean value: 0.6848688380862632
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.6875 0.8 0.93333333 0.86666667 0.66666667
|
|
0.86666667 0.73333333 0.86666667 0.93333333]
|
|
|
|
mean value: 0.8354166666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.73684211 0.8 0.93333333 0.875 0.66666667
|
|
0.85714286 0.75 0.88888889 0.93333333]
|
|
|
|
mean value: 0.8441207184628238
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.63636364 0.75 0.875 0.77777778 0.625
|
|
1. 0.75 0.8 1. ]
|
|
|
|
mean value: 0.8214141414141414
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 1. 1. 0.71428571
|
|
0.75 0.75 1. 0.875 ]
|
|
|
|
mean value: 0.8821428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.6875 0.80357143 0.9375 0.875 0.66964286
|
|
0.875 0.73214286 0.85714286 0.9375 ]
|
|
|
|
mean value: 0.8375
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.58333333 0.66666667 0.875 0.77777778 0.5
|
|
0.75 0.6 0.8 0.875 ]
|
|
|
|
mean value: 0.7427777777777778
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.99027419 1.03364635 0.99537587 0.99380183 1.01279187 1.00695038
|
|
1.00400448 0.99069548 0.98471999 0.9896822 ]
|
|
|
|
mean value: 1.000194263458252
|
|
|
|
key: score_time
|
|
value: [0.09284711 0.09792686 0.09596872 0.09674335 0.097049 0.09700847
|
|
0.09636211 0.08898997 0.08923626 0.15491176]
|
|
|
|
mean value: 0.10070436000823975
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.8819171 0.875 0.76376262 1. 0.87287156
|
|
0.60714286 0.87287156 1. 0.73214286]
|
|
|
|
mean value: 0.848762565937602
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.9375 0.93333333 0.86666667 1. 0.93333333
|
|
0.8 0.93333333 1. 0.86666667]
|
|
|
|
mean value: 0.9208333333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.94117647 0.93333333 0.875 1. 0.92307692
|
|
0.8 0.94117647 1. 0.875 ]
|
|
|
|
mean value: 0.9229939668174962
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.88888889 0.875 0.77777778 1. 1.
|
|
0.85714286 0.88888889 1. 0.875 ]
|
|
|
|
mean value: 0.9051587301587302
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
0.75 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.9375 0.9375 0.875 1. 0.92857143
|
|
0.80357143 0.92857143 1. 0.86607143]
|
|
|
|
mean value: 0.9214285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.88888889 0.875 0.77777778 1. 0.85714286
|
|
0.66666667 0.88888889 1. 0.77777778]
|
|
|
|
mean value: 0.8621031746031745
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.79997134 0.8617053 0.82350397 0.86626768 0.86152625 0.8775835
|
|
0.89794159 0.84732342 0.82997847 0.88673472]
|
|
|
|
mean value: 0.8552536249160767
|
|
|
|
key: score_time
|
|
value: [0.23055267 0.18991017 0.19632292 0.2545855 0.13287044 0.18487072
|
|
0.21556759 0.20604992 0.17664123 0.12801123]
|
|
|
|
mean value: 0.19153823852539062
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.75 0.875 0.76376262 0.87287156 0.73214286
|
|
0.60714286 0.73214286 1. 0.73214286]
|
|
|
|
mean value: 0.7947122709029568
|
|
|
|
key: train_mcc
|
|
value: [0.97100831 0.94117647 0.95710706 0.98550418 0.95630861 0.97080136
|
|
0.98550418 0.98550725 0.97122151 0.98550725]
|
|
|
|
mean value: 0.9709646177394017
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.875 0.93333333 0.86666667 0.93333333 0.86666667
|
|
0.8 0.86666667 1. 0.86666667]
|
|
|
|
mean value: 0.8945833333333334
|
|
|
|
key: train_accuracy
|
|
value: [0.98529412 0.97058824 0.97810219 0.99270073 0.97810219 0.98540146
|
|
0.99270073 0.99270073 0.98540146 0.99270073]
|
|
|
|
mean value: 0.9853692571919279
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.875 0.93333333 0.875 0.92307692 0.85714286
|
|
0.8 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.8954729584141349
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 0.97058824 0.9787234 0.99280576 0.97810219 0.98550725
|
|
0.99259259 0.99270073 0.98550725 0.99270073]
|
|
|
|
mean value: 0.9854735376303184
|
|
|
|
key: test_precision
value: [0.88888889 0.875 0.875 0.77777778 1. 0.85714286
0.85714286 0.875 1. 0.875 ]

mean value: 0.888095238095238

key: train_precision
value: [0.97142857 0.97058824 0.95833333 0.98571429 0.98529412 0.98550725
1. 0.98550725 0.97142857 0.98550725]

mean value: 0.9799308853976374

key: test_recall
value: [1. 0.875 1. 1. 0.85714286 0.85714286
0.75 0.875 1. 0.875 ]

mean value: 0.9089285714285714

key: train_recall
value: [1. 0.97058824 1. 1. 0.97101449 0.98550725
0.98529412 1. 1. 1. ]

mean value: 0.9912404092071612

key: test_roc_auc
value: [0.9375 0.875 0.9375 0.875 0.92857143 0.86607143
0.80357143 0.86607143 1. 0.86607143]

mean value: 0.8955357142857143

key: train_roc_auc
value: [0.98529412 0.97058824 0.97794118 0.99264706 0.97815431 0.98540068
0.99264706 0.99275362 0.98550725 0.99275362]

mean value: 0.9853687127024723

key: test_jcc
value: [0.88888889 0.77777778 0.875 0.77777778 0.85714286 0.75
0.66666667 0.77777778 1. 0.77777778]

mean value: 0.8148809523809524

key: train_jcc
value: [0.97142857 0.94285714 0.95833333 0.98571429 0.95714286 0.97142857
0.98529412 0.98550725 0.97142857 0.98550725]

mean value: 0.9714641943734016

MCC on Blind test: 0.1

Accuracy on Blind test: 0.79

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01698375 0.00691271 0.00714707 0.00702286 0.00691867 0.00681043
0.00706601 0.0070343 0.00683665 0.0071075 ]

mean value: 0.007983994483947755

key: score_time
value: [0.01220894 0.00788188 0.00856304 0.00795507 0.0079546 0.00788164
0.00790548 0.00793004 0.00796223 0.00810909]

mean value: 0.008435201644897462

key: test_mcc
value: [ 0.25 -0.25 0.73214286 0.09449112 0.75592895 0.49099025
0.33928571 -0.13363062 0.33928571 0.19642857]

mean value: 0.2814922553488389

key: train_mcc
value: [0.50195781 0.54894692 0.44946013 0.47724794 0.37278745 0.44522592
0.41602728 0.48933032 0.41632915 0.44553401]

mean value: 0.4562846929723249

key: test_accuracy
value: [0.625 0.375 0.86666667 0.53333333 0.86666667 0.73333333
0.66666667 0.46666667 0.66666667 0.6 ]

mean value: 0.64

key: train_accuracy
value: [0.75 0.77205882 0.72262774 0.73722628 0.68613139 0.72262774
0.7080292 0.74452555 0.7080292 0.72262774]

mean value: 0.727388364104766

key: test_fscore
value: [0.625 0.375 0.85714286 0.58823529 0.83333333 0.75
0.66666667 0.6 0.66666667 0.625 ]

mean value: 0.658704481792717

key: train_fscore
value: [0.76056338 0.7862069 0.74324324 0.75342466 0.68148148 0.72463768
0.70588235 0.73684211 0.71014493 0.72463768]

mean value: 0.7327064407151792

key: test_precision
value: [0.625 0.375 0.85714286 0.5 1. 0.66666667
0.71428571 0.5 0.71428571 0.625 ]

mean value: 0.6577380952380952

key: train_precision
value: [0.72972973 0.74025974 0.69620253 0.71428571 0.6969697 0.72463768
0.70588235 0.75384615 0.7 0.71428571]

mean value: 0.7176099315122916

key: test_recall
value: [0.625 0.375 0.85714286 0.71428571 0.71428571 0.85714286
0.625 0.75 0.625 0.625 ]

mean value: 0.6767857142857143

key: train_recall
value: [0.79411765 0.83823529 0.79710145 0.79710145 0.66666667 0.72463768
0.70588235 0.72058824 0.72058824 0.73529412]

mean value: 0.7500213128729752

key: test_roc_auc
value: [0.625 0.375 0.86607143 0.54464286 0.85714286 0.74107143
0.66964286 0.44642857 0.66964286 0.59821429]

mean value: 0.6392857142857143

key: train_roc_auc
value: [0.75 0.77205882 0.72208014 0.73678602 0.68627451 0.72261296
0.70801364 0.74435209 0.7081202 0.72271952]

mean value: 0.7273017902813299

key: test_jcc
value: [0.45454545 0.23076923 0.75 0.41666667 0.71428571 0.6
0.5 0.42857143 0.5 0.45454545]

mean value: 0.504938394938395

key: train_jcc
value: [0.61363636 0.64772727 0.59139785 0.6043956 0.51685393 0.56818182
0.54545455 0.58333333 0.5505618 0.56818182]

mean value: 0.57897243357102

MCC on Blind test: 0.1

Accuracy on Blind test: 0.58

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.06262994 0.03505611 0.03692508 0.03597021 0.06148291 0.03540158
0.03483367 0.03505754 0.04629922 0.03492475]

mean value: 0.04185810089111328

key: score_time
value: [0.01055789 0.01049376 0.01050019 0.01044226 0.01041293 0.01036716
0.01037478 0.0117774 0.01043272 0.01040506]

mean value: 0.010576415061950683

key: test_mcc
value: [1. 0.8819171 0.875 0.76376262 1. 1.
0.87287156 1. 1. 0.875 ]

mean value: 0.9268551280458139

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 0.9375 0.93333333 0.86666667 1. 1.
0.93333333 1. 1. 0.93333333]

mean value: 0.9604166666666667

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 0.94117647 0.93333333 0.875 1. 1.
0.94117647 1. 1. 0.93333333]

mean value: 0.9624019607843137

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.88888889 0.875 0.77777778 1. 1.
0.88888889 1. 1. 1. ]

mean value: 0.9430555555555555

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 0.875]

mean value: 0.9875

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 0.9375 0.9375 0.875 1. 1.
0.92857143 1. 1. 0.9375 ]

mean value: 0.9616071428571429

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 0.88888889 0.875 0.77777778 1. 1.
0.88888889 1. 1. 0.875 ]

mean value: 0.9305555555555556

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.12

Accuracy on Blind test: 0.84

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01320124 0.01201296 0.01213074 0.01227403 0.01183081 0.01187682
0.01212811 0.01183581 0.01189661 0.03893161]

mean value: 0.014811873435974121

key: score_time
value: [0.0113101 0.01076269 0.01052332 0.01057029 0.0105257 0.01047421
0.0105381 0.01045918 0.01049614 0.01063371]

mean value: 0.01062934398651123

key: test_mcc
value: [0.77459667 0.77459667 0.73214286 0.66143783 0.87287156 0.87287156
0.75592895 0.47245559 0.64465837 1. ]

mean value: 0.7561560053780203

key: train_mcc
value: [0.92898531 0.92737353 0.91392776 0.97120941 0.91277477 0.94318882
0.88668406 0.94323594 0.91597649 0.92791659]

mean value: 0.927127267186985

key: test_accuracy
value: [0.875 0.875 0.86666667 0.8 0.93333333 0.93333333
0.86666667 0.73333333 0.8 1. ]

mean value: 0.8683333333333334

key: train_accuracy
value: [0.96323529 0.96323529 0.95620438 0.98540146 0.95620438 0.97080292
0.94160584 0.97080292 0.95620438 0.96350365]

mean value: 0.9627200515242593

key: test_fscore
value: [0.88888889 0.88888889 0.85714286 0.82352941 0.92307692 0.92307692
0.88888889 0.77777778 0.84210526 1. ]

mean value: 0.8813375822663748

key: train_fscore
value: [0.96453901 0.96402878 0.95774648 0.98571429 0.95714286 0.97183099
0.94366197 0.97142857 0.95774648 0.96402878]

mean value: 0.9637868190827705

key: test_precision
value: [0.8 0.8 0.85714286 0.7 1. 1.
0.8 0.7 0.72727273 1. ]

mean value: 0.8384415584415584

key: train_precision
value: [0.93150685 0.94366197 0.93150685 0.97183099 0.94366197 0.94520548
0.90540541 0.94444444 0.91891892 0.94366197]

mean value: 0.9379804848259411

key: test_recall
value: [1. 1. 0.85714286 1. 0.85714286 0.85714286
1. 0.875 1. 1. ]

mean value: 0.9446428571428571

key: train_recall
value: [1. 0.98529412 0.98550725 1. 0.97101449 1.
0.98529412 1. 1. 0.98529412]

mean value: 0.9912404092071612

key: test_roc_auc
value: [0.875 0.875 0.86607143 0.8125 0.92857143 0.92857143
0.85714286 0.72321429 0.78571429 1. ]

mean value: 0.8651785714285715

key: train_roc_auc
value: [0.96323529 0.96323529 0.95598892 0.98529412 0.95609548 0.97058824
0.94192242 0.97101449 0.95652174 0.96366155]

mean value: 0.9627557544757033

key: test_jcc
value: [0.8 0.8 0.75 0.7 0.85714286 0.85714286
0.8 0.63636364 0.72727273 1. ]

mean value: 0.7927922077922078

key: train_jcc
value: [0.93150685 0.93055556 0.91891892 0.97183099 0.91780822 0.94520548
0.89333333 0.94444444 0.91891892 0.93055556]

mean value: 0.9303078260587425

MCC on Blind test: 0.05

Accuracy on Blind test: 0.64

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.00955915 0.0070889 0.00685692 0.00679111 0.00704551 0.0069344
0.00679803 0.0070312 0.00693274 0.00679612]

mean value: 0.0071834087371826175

key: score_time
value: [0.01106119 0.00804234 0.00776124 0.00791359 0.00784922 0.00808263
0.0079062 0.00786209 0.00788569 0.00790429]

mean value: 0.008226847648620606

key: test_mcc
value: [ 0.12598816 0.25819889 0.73214286 0.33928571 0.87287156 0.37796447
0.19642857 -0.13363062 0.46428571 0.6000992 ]

mean value: 0.3833634515705724

key: train_mcc
value: [0.48661135 0.51745489 0.47592003 0.50667322 0.41725962 0.50373224
0.50394373 0.5339313 0.53314859 0.47473887]

mean value: 0.4953413853595016

key: test_accuracy
value: [0.5625 0.625 0.86666667 0.66666667 0.93333333 0.66666667
0.6 0.46666667 0.73333333 0.8 ]

mean value: 0.6920833333333334

key: train_accuracy
value: [0.74264706 0.75735294 0.73722628 0.75182482 0.7080292 0.75182482
0.75182482 0.76642336 0.76642336 0.73722628]

mean value: 0.747080291970803

key: test_fscore
value: [0.58823529 0.57142857 0.85714286 0.66666667 0.92307692 0.70588235
0.625 0.6 0.75 0.82352941]

mean value: 0.7110962077138547

key: train_fscore
value: [0.75177305 0.76923077 0.75 0.76712329 0.72222222 0.75714286
0.75362319 0.77142857 0.76811594 0.73913043]

mean value: 0.7549790322558434

key: test_precision
value: [0.55555556 0.66666667 0.85714286 0.625 1. 0.6
0.625 0.5 0.75 0.77777778]

mean value: 0.6957142857142857

key: train_precision
value: [0.7260274 0.73333333 0.72 0.72727273 0.69333333 0.74647887
0.74285714 0.75 0.75714286 0.72857143]

mean value: 0.7325017093010533

key: test_recall
value: [0.625 0.5 0.85714286 0.71428571 0.85714286 0.85714286
0.625 0.75 0.75 0.875 ]

mean value: 0.7410714285714286

key: train_recall
value: [0.77941176 0.80882353 0.7826087 0.8115942 0.75362319 0.76811594
0.76470588 0.79411765 0.77941176 0.75 ]

mean value: 0.7792412617220801

key: test_roc_auc
value: [0.5625 0.625 0.86607143 0.66964286 0.92857143 0.67857143
0.59821429 0.44642857 0.73214286 0.79464286]

mean value: 0.6901785714285714

key: train_roc_auc
value: [0.74264706 0.75735294 0.73689258 0.75138534 0.70769395 0.75170503
0.75191816 0.76662404 0.76651748 0.73731884]

mean value: 0.7470055413469735

key: test_jcc
value: [0.41666667 0.4 0.75 0.5 0.85714286 0.54545455
0.45454545 0.42857143 0.6 0.7 ]

mean value: 0.5652380952380952

key: train_jcc
value: [0.60227273 0.625 0.6 0.62222222 0.56521739 0.6091954
0.60465116 0.62790698 0.62352941 0.5862069 ]

mean value: 0.6066202190949461

MCC on Blind test: 0.1

Accuracy on Blind test: 0.6

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00778508 0.00735497 0.00741458 0.00750518 0.0074389 0.00734568
0.00739765 0.00764251 0.00755811 0.00754356]

mean value: 0.007498621940612793

key: score_time
value: [0.00792003 0.00796342 0.00831413 0.00790501 0.0078187 0.00797248
0.00821495 0.00799203 0.0079875 0.00809884]

mean value: 0.008018708229064942

key: test_mcc
value: [0.62994079 0.62994079 0.875 0.19642857 0.87287156 0.87287156
0.32732684 0.75592895 0.64465837 0.875 ]

mean value: 0.6679967422606682

key: train_mcc
value: [0.88580789 0.91334626 0.89863497 0.83795818 0.91240409 0.83063246
0.92787101 0.91281179 0.92710997 0.92709446]

mean value: 0.8973671087701672

key: test_accuracy
value: [0.8125 0.8125 0.93333333 0.6 0.93333333 0.93333333
0.66666667 0.86666667 0.8 0.93333333]

mean value: 0.8291666666666667

key: train_accuracy
value: [0.94117647 0.95588235 0.94890511 0.91240876 0.95620438 0.91240876
0.96350365 0.95620438 0.96350365 0.96350365]

mean value: 0.9473701159295835

key: test_fscore
value: [0.82352941 0.82352941 0.93333333 0.57142857 0.92307692 0.92307692
0.70588235 0.88888889 0.84210526 0.93333333]

mean value: 0.8368184412766456

key: train_fscore
value: [0.93846154 0.95714286 0.95035461 0.9047619 0.95652174 0.90769231
0.96240602 0.95652174 0.96350365 0.96296296]

mean value: 0.9460329323884149

key: test_precision
value: [0.77777778 0.77777778 0.875 0.57142857 1. 1.
0.66666667 0.8 0.72727273 1. ]

mean value: 0.819592352092352

key: train_precision
value: [0.98387097 0.93055556 0.93055556 1. 0.95652174 0.96721311
0.98461538 0.94285714 0.95652174 0.97014925]

mean value: 0.9622860453071885

key: test_recall
value: [0.875 0.875 1. 0.57142857 0.85714286 0.85714286
0.75 1. 1. 0.875 ]

mean value: 0.8660714285714286

key: train_recall
value: [0.89705882 0.98529412 0.97101449 0.82608696 0.95652174 0.85507246
0.94117647 0.97058824 0.97058824 0.95588235]

mean value: 0.9329283887468031

key: test_roc_auc
value: [0.8125 0.8125 0.9375 0.59821429 0.92857143 0.92857143
0.66071429 0.85714286 0.78571429 0.9375 ]

mean value: 0.8258928571428572

key: train_roc_auc
value: [0.94117647 0.95588235 0.94874254 0.91304348 0.95620205 0.91283035
0.96334186 0.95630861 0.96355499 0.96344842]

mean value: 0.9474531116794545

key: test_jcc
value: [0.7 0.7 0.875 0.4 0.85714286 0.85714286
0.54545455 0.8 0.72727273 0.875 ]

mean value: 0.7337012987012987

key: train_jcc
value: [0.88405797 0.91780822 0.90540541 0.82608696 0.91666667 0.83098592
0.92753623 0.91666667 0.92957746 0.92857143]

mean value: 0.8983362926190229

MCC on Blind test: 0.05

Accuracy on Blind test: 0.63

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01016855 0.0098474 0.0079248 0.00727248 0.00720954 0.00730157
|
|
0.00728512 0.0073278 0.00719166 0.00727534]
|
|
|
|
mean value: 0.007880425453186036
|
|
|
|
key: score_time
|
|
value: [0.010952 0.00936007 0.00861073 0.00789762 0.00792098 0.00781727
|
|
0.00834227 0.00790691 0.00788283 0.00789428]
|
|
|
|
mean value: 0.008458495140075684
|
|
|
|
key: test_mcc
|
|
value: [0.57735027 0.8819171 0.875 0.33928571 0.87287156 0.87287156
|
|
0.33928571 0.37796447 0.46428571 0.875 ]
|
|
|
|
mean value: 0.6475832110632131
|
|
|
|
key: train_mcc
|
|
value: [0.63408348 0.8979331 0.77817796 0.83063246 0.92951942 0.81712461
|
|
0.85977656 0.72794365 0.85721269 0.88920184]
|
|
|
|
mean value: 0.822160576316637
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.9375 0.93333333 0.66666667 0.93333333 0.93333333
|
|
0.66666667 0.66666667 0.73333333 0.93333333]
|
|
|
|
mean value: 0.8154166666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.78676471 0.94852941 0.88321168 0.91240876 0.96350365 0.90510949
|
|
0.9270073 0.84671533 0.9270073 0.94160584]
|
|
|
|
mean value: 0.9041863460712752
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.93333333 0.93333333 0.66666667 0.92307692 0.92307692
|
|
0.66666667 0.61538462 0.75 0.93333333]
|
|
|
|
mean value: 0.8144871794871795
|
|
|
|
key: train_fscore
|
|
value: [0.82424242 0.94736842 0.89333333 0.90769231 0.96240602 0.91156463
|
|
0.921875 0.8173913 0.92307692 0.9375 ]
|
|
|
|
mean value: 0.904645035463338
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.875 0.625 1. 1.
|
|
0.71428571 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8430952380952381
|
|
|
|
key: train_precision
|
|
value: [0.70103093 0.96923077 0.82716049 0.96721311 1. 0.85897436
|
|
0.98333333 1. 0.96774194 1. ]
|
|
|
|
mean value: 0.9274684933438643
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 1. 0.71428571 0.85714286 0.85714286
|
|
0.625 0.5 0.75 0.875 ]
|
|
|
|
mean value: 0.8053571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 0.92647059 0.97101449 0.85507246 0.92753623 0.97101449
|
|
0.86764706 0.69117647 0.88235294 0.88235294]
|
|
|
|
mean value: 0.8974637681159421
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.9375 0.9375 0.66964286 0.92857143 0.92857143
|
|
0.66964286 0.67857143 0.73214286 0.9375 ]
|
|
|
|
mean value: 0.8169642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.78676471 0.94852941 0.88256607 0.91283035 0.96376812 0.90462489
|
|
0.92657715 0.84558824 0.92668372 0.94117647]
|
|
|
|
mean value: 0.9039109121909633
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.875 0.875 0.5 0.85714286 0.85714286
|
|
0.5 0.44444444 0.6 0.875 ]
|
|
|
|
mean value: 0.7050396825396825
|
|
|
|
key: train_jcc
|
|
value: [0.70103093 0.9 0.80722892 0.83098592 0.92753623 0.8375
|
|
0.85507246 0.69117647 0.85714286 0.88235294]
|
|
|
|
mean value: 0.8290026723550397
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07523441 0.06551671 0.06416392 0.06420016 0.06557775 0.06523657
|
|
0.06472826 0.06670904 0.06583929 0.06667423]
|
|
|
|
mean value: 0.06638803482055664
|
|
|
|
key: score_time
|
|
value: [0.01517701 0.01486087 0.01571703 0.01545548 0.01541901 0.01526618
|
|
0.01506066 0.01570487 0.01489067 0.01541162]
|
|
|
|
mean value: 0.015296339988708496
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.8819171 0.875 0.66143783 1. 0.87287156
|
|
1. 0.87287156 0.87287156 0.875 ]
|
|
|
|
mean value: 0.879388671797445
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.9375 0.93333333 0.8 1. 0.93333333
|
|
1. 0.93333333 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9341666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.94117647 0.93333333 0.82352941 1. 0.92307692
|
|
1. 0.94117647 0.94117647 0.93333333]
|
|
|
|
mean value: 0.9377978883861237
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.88888889 0.875 0.7 1. 1.
|
|
1. 0.88888889 0.88888889 1. ]
|
|
|
|
mean value: 0.9130555555555555
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.9375 0.9375 0.8125 1. 0.92857143
|
|
1. 0.92857143 0.92857143 0.9375 ]
|
|
|
|
mean value: 0.9348214285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.88888889 0.875 0.7 1. 0.85714286
|
|
1. 0.88888889 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8862698412698412
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03459382 0.04311633 0.02597976 0.02609849 0.03556275 0.03000331
|
|
0.02980828 0.03237772 0.04908109 0.03363228]
|
|
|
|
mean value: 0.034025382995605466
|
|
|
|
key: score_time
|
|
value: [0.03137994 0.01657486 0.01867056 0.01809192 0.03612328 0.02216148
|
|
0.02189708 0.01990652 0.03687644 0.01487947]
|
|
|
|
mean value: 0.023656153678894044
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 0.875 0.76376262 0.87287156 0.87287156
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9141422841402109
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 0.98550418 1. 0.98550725 1.
|
|
1. 0.98550418 1. 0.98550725]
|
|
|
|
mean value: 0.9942022851330479
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 0.93333333 0.86666667 0.93333333 0.93333333
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.95375
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 0.99270073 1. 0.99270073 1.
|
|
1. 0.99270073 1. 0.99270073]
|
|
|
|
mean value: 0.997080291970803
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 0.93333333 0.875 0.92307692 0.92307692
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9528996983408748
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 0.99280576 1. 0.99270073 1.
|
|
1. 0.99259259 1. 0.99270073]
|
|
|
|
mean value: 0.9970799807842291
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 0.875 0.77777778 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9541666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.98571429 1. 1. 1.
|
|
1. 1. 1. 0.98550725]
|
|
|
|
mean value: 0.9971221532091097
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9589285714285715
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 0.98550725 1.
|
|
1. 0.98529412 1. 1. ]
|
|
|
|
mean value: 0.997080136402387
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 0.9375 0.875 0.92857143 0.92857143
|
|
1. 1. 1. 0.9375 ]
|
|
|
|
mean value: 0.9544642857142858
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 0.99264706 1. 0.99275362 1.
|
|
1. 0.99264706 1. 0.99275362]
|
|
|
|
mean value: 0.997080136402387
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 0.875 0.77777778 0.85714286 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9130952380952381
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 0.98571429 1. 0.98550725 1.
|
|
1. 0.98529412 1. 0.98550725]
|
|
|
|
mean value: 0.9942022896114968
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03075981 0.03832865 0.06963396 0.06718922 0.03995085 0.03923106
|
|
0.03885245 0.061131 0.04299402 0.03564787]
|
|
|
|
mean value: 0.04637188911437988
|
|
|
|
key: score_time
|
|
value: [0.02218819 0.01115394 0.01116896 0.03056479 0.02163672 0.02096963
|
|
0.02151918 0.03110862 0.01723385 0.01872468]
|
|
|
|
mean value: 0.02062685489654541
|
|
|
|
key: test_mcc
|
|
value: [0.67419986 0.75 0.87287156 0.37796447 1. 0.73214286
|
|
0.46428571 0.46428571 1. 0.76376262]
|
|
|
|
mean value: 0.7099512797956697
|
|
|
|
key: train_mcc
|
|
value: [0.95598573 0.98540068 0.97080136 0.95630861 0.97080136 0.95630861
|
|
0.97080136 0.97080136 0.97080136 0.97080136]
|
|
|
|
mean value: 0.9678811811884551
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.875 0.93333333 0.66666667 1. 0.86666667
|
|
0.73333333 0.73333333 1. 0.86666667]
|
|
|
|
mean value: 0.84875
|
|
|
|
key: train_accuracy
|
|
value: [0.97794118 0.99264706 0.98540146 0.97810219 0.98540146 0.97810219
|
|
0.98540146 0.98540146 0.98540146 0.98540146]
|
|
|
|
mean value: 0.9839201373980249
|
|
|
|
key: test_fscore
|
|
value: [0.84210526 0.875 0.92307692 0.70588235 1. 0.85714286
|
|
0.75 0.75 1. 0.85714286]
|
|
|
|
mean value: 0.8560350253461708
|
|
|
|
key: train_fscore
|
|
value: [0.97777778 0.99259259 0.98550725 0.97810219 0.98550725 0.97810219
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9838765713274273
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.875 1. 0.6 1. 0.85714286
|
|
0.75 0.75 1. 1. ]
|
|
|
|
mean value: 0.8559415584415584
|
|
|
|
key: train_precision
|
|
value: [0.98507463 1. 0.98550725 0.98529412 0.98550725 0.98529412
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9867853825501648
|
|
|
|
key: test_recall
|
|
value: [1. 0.875 0.85714286 0.85714286 1. 0.85714286
|
|
0.75 0.75 1. 0.75 ]
|
|
|
|
mean value: 0.8696428571428572
|
|
|
|
key: train_recall
|
|
value: [0.97058824 0.98529412 0.98550725 0.97101449 0.98550725 0.97101449
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9810102301790282
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.875 0.92857143 0.67857143 1. 0.86607143
|
|
0.73214286 0.73214286 1. 0.875 ]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_roc_auc
|
|
value: [0.97794118 0.99264706 0.98540068 0.97815431 0.98540068 0.97815431
|
|
0.98540068 0.98540068 0.98540068 0.98540068]
|
|
|
|
mean value: 0.9839300937766412
|
|
|
|
key: test_jcc
|
|
value: [0.72727273 0.77777778 0.85714286 0.54545455 1. 0.75
|
|
0.6 0.6 1. 0.75 ]
|
|
|
|
mean value: 0.7607647907647908
|
|
|
|
key: train_jcc
|
|
value: [0.95652174 0.98529412 0.97142857 0.95714286 0.97142857 0.95714286
|
|
0.97101449 0.97101449 0.97101449 0.97101449]
|
|
|
|
mean value: 0.9683016684934843
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09868574 0.10024285 0.09119558 0.08083391 0.09357262 0.08993793
|
|
0.10161471 0.10149956 0.09327483 0.08393335]
|
|
|
|
mean value: 0.0934791088104248
|
|
|
|
key: score_time
|
|
value: [0.00927162 0.00913954 0.00923514 0.00928712 0.00947499 0.00919628
|
|
0.00924039 0.00923944 0.00928307 0.00930262]
|
|
|
|
mean value: 0.009267020225524902
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8819171 0.875 0.76376262 1. 0.87287156
|
|
1. 1. 0.87287156 0.73214286]
|
|
|
|
mean value: 0.8998565698544966
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9375 0.93333333 0.86666667 1. 0.93333333
|
|
1. 1. 0.93333333 0.86666667]
|
|
|
|
mean value: 0.9470833333333334
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94117647 0.93333333 0.875 1. 0.92307692
|
|
1. 1. 0.94117647 0.875 ]
|
|
|
|
mean value: 0.9488763197586727
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88888889 0.875 0.77777778 1. 1.
|
|
1. 1. 0.88888889 0.875 ]
|
|
|
|
mean value: 0.9305555555555556
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9375 0.9375 0.875 1. 0.92857143
|
|
1. 1. 0.92857143 0.86607143]
|
|
|
|
mean value: 0.9473214285714285
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88888889 0.875 0.77777778 1. 0.85714286
|
|
1. 1. 0.88888889 0.77777778]
|
|
|
|
mean value: 0.906547619047619
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00916886 0.01091504 0.01081634 0.01080394 0.0134356 0.02716422
|
|
0.01087403 0.01102185 0.01133323 0.01125884]
|
|
|
|
mean value: 0.012679195404052735
|
|
|
|
key: score_time
|
|
value: [0.01023698 0.01037884 0.01042628 0.01103234 0.01079631 0.01123476
|
|
0.01329875 0.01280212 0.01062632 0.01068163]
|
|
|
|
mean value: 0.011151432991027832
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.67419986 0.75592895 0.75592895 0.75592895 0.53452248
|
|
0.37796447 0.76376262 0.76376262 0.76376262]
|
|
|
|
mean value: 0.7027678608518798
|
|
|
|
key: train_mcc
|
|
value: [1. 0.90184995 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9901849950564579
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.8125 0.86666667 0.86666667 0.86666667 0.73333333
|
|
0.66666667 0.86666667 0.86666667 0.86666667]
|
|
|
|
mean value: 0.835
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.94852941 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9948529411764706
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.76923077 0.83333333 0.83333333 0.83333333 0.6
|
|
0.61538462 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.7997220426632191
|
|
|
|
key: train_fscore
|
|
value: [1. 0.94573643 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9945736434108527
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 1. 1. 1. 1.
|
|
0.8 1. 1. 1. ]
|
|
|
|
mean value: 0.9688888888888889
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.625 0.71428571 0.71428571 0.71428571 0.42857143
|
|
0.5 0.75 0.75 0.75 ]
|
|
|
|
mean value: 0.6946428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 0.89705882 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9897058823529412
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8125 0.85714286 0.85714286 0.85714286 0.71428571
|
|
0.67857143 0.875 0.875 0.875 ]
|
|
|
|
mean value: 0.8339285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.94852941 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9948529411764706
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.625 0.71428571 0.71428571 0.71428571 0.42857143
|
|
0.44444444 0.75 0.75 0.75 ]
|
|
|
|
mean value: 0.6779761904761905
|
|
|
|
key: train_jcc
|
|
value: [1. 0.89705882 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9897058823529412
|
|
|
|
MCC on Blind test: -0.02
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01154399 0.01006269 0.00780892 0.00763559 0.00742674 0.00739622
|
|
0.00755072 0.00743032 0.00746632 0.00746202]
|
|
|
|
mean value: 0.008178353309631348
|
|
|
|
key: score_time
|
|
value: [0.01060176 0.00935245 0.00819874 0.00818491 0.00788951 0.00785613
|
|
0.00786829 0.00791764 0.00788474 0.0078702 ]
|
|
|
|
mean value: 0.008362436294555664
|
|
|
|
key: test_mcc
|
|
value: [0.75 0.62994079 0.73214286 0.49099025 0.87287156 0.87287156
|
|
0.64465837 0.6000992 0.64465837 0.875 ]
|
|
|
|
mean value: 0.7113232961000079
|
|
|
|
key: train_mcc
|
|
value: [0.83832595 0.86849267 0.85434012 0.91240409 0.86868474 0.8978896
|
|
0.88360693 0.82480818 0.86948194 0.8555278 ]
|
|
|
|
mean value: 0.8673562022561286
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.8125 0.86666667 0.73333333 0.93333333 0.93333333
|
|
0.8 0.8 0.8 0.93333333]
|
|
|
|
mean value: 0.84875
|
|
|
|
key: train_accuracy
|
|
value: [0.91911765 0.93382353 0.9270073 0.95620438 0.93430657 0.94890511
|
|
0.94160584 0.91240876 0.93430657 0.9270073 ]
|
|
|
|
mean value: 0.9334693001288106
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.82352941 0.85714286 0.75 0.92307692 0.92307692
|
|
0.84210526 0.82352941 0.84210526 0.93333333]
|
|
|
|
mean value: 0.8592899386475238
|
|
|
|
key: train_fscore
|
|
value: [0.91970803 0.9352518 0.92857143 0.95652174 0.9352518 0.94964029
|
|
0.94202899 0.91176471 0.9352518 0.92857143]
|
|
|
|
mean value: 0.9342562000313209
|
|
|
|
key: test_precision
|
|
value: [0.875 0.77777778 0.85714286 0.66666667 1. 1.
|
|
0.72727273 0.77777778 0.72727273 1. ]
|
|
|
|
mean value: 0.8408910533910534
|
|
|
|
key: train_precision
|
|
value: [0.91304348 0.91549296 0.91549296 0.95652174 0.92857143 0.94285714
|
|
0.92857143 0.91176471 0.91549296 0.90277778]
|
|
|
|
mean value: 0.9230586574290872
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.85714286 0.85714286 0.85714286 0.85714286
|
|
1. 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.8928571428571428
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.95588235 0.94202899 0.95652174 0.94202899 0.95652174
|
|
0.95588235 0.91176471 0.95588235 0.95588235]
|
|
|
|
mean value: 0.9458866155157716
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.8125 0.86607143 0.74107143 0.92857143 0.92857143
|
|
0.78571429 0.79464286 0.78571429 0.9375 ]
|
|
|
|
mean value: 0.8455357142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.91911765 0.93382353 0.92689685 0.95620205 0.93424979 0.9488491
|
|
0.94170929 0.91240409 0.93446292 0.92721654]
|
|
|
|
mean value: 0.933493179880648
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.7 0.75 0.6 0.85714286 0.85714286
|
|
0.72727273 0.7 0.72727273 0.875 ]
|
|
|
|
mean value: 0.7571608946608946
|
|
|
|
key: train_jcc
|
|
value: [0.85135135 0.87837838 0.86666667 0.91666667 0.87837838 0.90410959
|
|
0.89041096 0.83783784 0.87837838 0.86666667]
|
|
|
|
mean value: 0.876884487226953
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa', 'kd_values', 'rd_values', 'electro_rr',
|
|
'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.07331181 0.06256533 0.06086087 0.06129169 0.06339002 0.06102061
|
|
0.06144905 0.06290483 0.06220293 0.06425667]
|
|
|
|
mean value: 0.0633253812789917
|
|
|
|
key: score_time
|
|
value: [0.00836086 0.00896025 0.00843644 0.00837779 0.00837159 0.00880218
|
|
0.00834155 0.00841522 0.00857377 0.00888371]
|
|
|
|
mean value: 0.008552336692810058
|
|
|
|
key: test_mcc
|
|
value: [0.75 0.62994079 0.73214286 0.66143783 0.87287156 0.87287156
|
|
0.64465837 0.6000992 0.64465837 0.875 ]
|
|
|
|
mean value: 0.7283680535735243
|
|
|
|
key: train_mcc
|
|
value: [0.83832595 0.87000211 0.88466669 0.91240409 0.86868474 0.89863497
|
|
0.90025835 0.88476385 0.9139999 0.84173622]
|
|
|
|
mean value: 0.8813476865607188
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.8125 0.86666667 0.8 0.93333333 0.93333333
|
|
0.8 0.8 0.8 0.93333333]
|
|
|
|
mean value: 0.8554166666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.91911765 0.93382353 0.94160584 0.95620438 0.93430657 0.94890511
|
|
0.94890511 0.94160584 0.95620438 0.91970803]
|
|
|
|
mean value: 0.940038643194504
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:203: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_config.py:206: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.82352941 0.85714286 0.82352941 0.92307692 0.92307692
|
|
0.84210526 0.82352941 0.84210526 0.93333333]
|
|
|
|
mean value: 0.8666428798239943
|
|
|
|
key: train_fscore
|
|
value: [0.91970803 0.93617021 0.94366197 0.95652174 0.9352518 0.95035461
|
|
0.95035461 0.94285714 0.95714286 0.92198582]
|
|
|
|
mean value: 0.9414008786946603
|
|
|
|
key: test_precision
|
|
value: [0.875 0.77777778 0.85714286 0.7 1. 1.
|
|
0.72727273 0.77777778 0.72727273 1. ]
|
|
|
|
mean value: 0.8442243867243867
|
|
|
|
key: train_precision
|
|
value: [0.91304348 0.90410959 0.91780822 0.95652174 0.92857143 0.93055556
|
|
0.91780822 0.91666667 0.93055556 0.89041096]
|
|
|
|
mean value: 0.920605141004188
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.85714286 1. 0.85714286 0.85714286
|
|
1. 0.875 1. 0.875 ]
|
|
|
|
mean value: 0.9071428571428571
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.97058824 0.97101449 0.95652174 0.94202899 0.97101449
|
|
0.98529412 0.97058824 0.98529412 0.95588235]
|
|
|
|
mean value: 0.9634697357203751
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.8125 0.86607143 0.8125 0.92857143 0.92857143
|
|
0.78571429 0.79464286 0.78571429 0.9375 ]
|
|
|
|
mean value: 0.8526785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.91911765 0.93382353 0.9413896 0.95620205 0.93424979 0.94874254
|
|
0.9491688 0.94181586 0.95641517 0.91997016]
|
|
|
|
mean value: 0.9400895140664962
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.7 0.75 0.7 0.85714286 0.85714286
|
|
0.72727273 0.7 0.72727273 0.875 ]
|
|
|
|
mean value: 0.7671608946608947
|
|
|
|
key: train_jcc
|
|
value: [0.85135135 0.88 0.89333333 0.91666667 0.87837838 0.90540541
|
|
0.90540541 0.89189189 0.91780822 0.85526316]
|
|
|
|
mean value: 0.8895503809505252
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.66
|