19822 lines
982 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_cd_sl.py:548: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerial columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 858

PASS: my_features_df and aa_df successfully combined
nrows: 858
ncols: 269
count of NULL values before imputation

or_mychisq 244
log10_or_mychisq 244
dtype: int64
count of NULL values AFTER imputation

mutationinformation 0
or_rawI 0
logorI 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 168
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 175

-------------------------------------------------------------
Successfully split data with stratification according to scaling law [COMPLETE data]: 1/sqrt(x_ncols)
Input features data size: (858, 175)
Train data size: (793, 175)
Test data size: (65, 175)
y_train numbers: Counter({0: 682, 1: 111})
y_train ratio: 6.1441441441441444

y_test_numbers: Counter({0: 56, 1: 9})
y_test ratio: 6.222222222222222
-------------------------------------------------------------

index: 0
ind: 1

Mask count check: True

index: 1
ind: 2

Mask count check: False
Original Data
Counter({0: 682, 1: 111}) Data dim: (793, 175)

Simple Random OverSampling
Counter({0: 682, 1: 682})
(1364, 175)

Simple Random UnderSampling
Counter({0: 111, 1: 111})
(222, 175)

Simple Combined Over and UnderSampling
Counter({0: 682, 1: 682})
(1364, 175)

SMOTE_NC OverSampling
Counter({0: 682, 1: 682})
(1364, 175)

#####################################################################

Running ML analysis [COMPLETE DATA]: 70/30 split
Gene name: embB
Drug name: ethambutol

Output directory: /home/tanu/git/Data/ethambutol/output/ml/tts_cd_sl/

Sanity checks:
Total input features: 175

Training data size: (793, 175)
Test data size: (65, 175)

Target feature numbers (training data): Counter({0: 682, 1: 111})
Target features ratio (training data: 6.1441441441441444

Target feature numbers (test data): Counter({0: 56, 1: 9})
Target features ratio (test data): 6.222222222222222

#####################################################################


================================================================

Strucutral features (n): 36
These are:
Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
These are:
['ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106']
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.0474534 0.0887804 0.17873287 0.09015036 0.12496829 0.15901494
0.1283195 0.17443967 0.16761804 0.1949439 ]

mean value: 0.13544213771820068

key: score_time
value: [0.01321054 0.02090883 0.03018737 0.02814436 0.02515435 0.03415728
0.04635882 0.01815367 0.02486181 0.0253098 ]

mean value: 0.026644682884216307

key: test_mcc
value: [0.49436016 0.50761192 0.67783439 0.66135521 0.71339159 0.64658323
0.54627358 0.57419245 0.29875024 0.57478846]

mean value: 0.5695141241239622

key: train_mcc
value: [0.69573947 0.72894477 0.67798071 0.68752657 0.67386786 0.67504963
0.71550727 0.6866705 0.69479833 0.70118527]

mean value: 0.693727039663616

key: test_accuracy
value: [0.9 0.8875 0.925 0.92405063 0.93670886 0.92405063
0.89873418 0.91139241 0.86075949 0.91139241]

mean value: 0.9079588607594937

key: train_accuracy
value: [0.93267882 0.93969144 0.92987377 0.93137255 0.92857143 0.92857143
0.93697479 0.93137255 0.93277311 0.93417367]

mean value: 0.9326053563080211

key: test_fscore
value: [0.42857143 0.57142857 0.66666667 0.7 0.73684211 0.66666667
0.6 0.53333333 0.35294118 0.58823529]

mean value: 0.584468524251806

key: train_fscore
value: [0.72093023 0.74853801 0.70238095 0.71005917 0.69822485 0.70175439
0.73684211 0.70658683 0.71764706 0.72189349]

mean value: 0.7164857087826803

key: test_precision
value: [1. 0.6 1. 0.77777778 0.875 0.85714286
0.66666667 1. 0.5 0.83333333]

mean value: 0.8109920634920635

key: train_precision
value: [0.86111111 0.90140845 0.85507246 0.86956522 0.85507246 0.84507042
0.88732394 0.88059701 0.87142857 0.88405797]

mean value: 0.8710707630308493

key: test_recall
value: [0.27272727 0.54545455 0.5 0.63636364 0.63636364 0.54545455
0.54545455 0.36363636 0.27272727 0.45454545]

mean value: 0.47727272727272724

key: train_recall
value: [0.62 0.64 0.5959596 0.6 0.59 0.6 0.63
0.59 0.61 0.61 ]

mean value: 0.6085959595959596

key: test_roc_auc
value: [0.63636364 0.74374177 0.75 0.80347594 0.81082888 0.76537433
0.75066845 0.68181818 0.61430481 0.71991979]

mean value: 0.7276495776176083

key: train_roc_auc
value: [0.80184339 0.81429038 0.78983648 0.79267101 0.78685668 0.79104235
0.80848534 0.78848534 0.79767101 0.79848534]

mean value: 0.7969667312260502

key: test_jcc
value: [0.27272727 0.4 0.5 0.53846154 0.58333333 0.5
0.42857143 0.36363636 0.21428571 0.41666667]

mean value: 0.42176823176823175

key: train_jcc
value: [0.56363636 0.59813084 0.5412844 0.55045872 0.53636364 0.54054054
0.58333333 0.5462963 0.55963303 0.56481481]

mean value: 0.5584491972895471

MCC on Blind test: 0.57

Accuracy on Blind test: 0.89

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [3.60542393 3.47140622 2.3955617 2.10153127 2.12939835 2.20545292
|
|
2.10316896 2.01559663 2.27213907 2.13030434]
|
|
|
|
mean value: 2.4429983377456663
|
|
|
|
key: score_time
|
|
value: [0.05521178 0.02824688 0.03234005 0.02881169 0.03445387 0.03774905
|
|
0.03767657 0.04927754 0.03254128 0.04663587]
|
|
|
|
mean value: 0.03829445838928223
|
|
|
|
key: test_mcc
|
|
value: [0.57458119 0.57839262 0.73674318 0.71339159 0.66135521 0.71339159
|
|
0.50667099 0.57419245 0.47192513 0.57478846]
|
|
|
|
mean value: 0.6105432419982048
|
|
|
|
key: train_mcc
|
|
value: [0.83609113 0.78340423 0.86005148 0.75742384 0.74441319 0.74441319
|
|
0.79014378 0.7642695 0.74989358 0.74361667]
|
|
|
|
mean value: 0.777372058017057
|
|
|
|
key: test_accuracy
|
|
value: [0.9125 0.9 0.9375 0.93670886 0.92405063 0.93670886
|
|
0.88607595 0.91139241 0.87341772 0.91139241]
|
|
|
|
mean value: 0.9129746835443038
|
|
|
|
key: train_accuracy
|
|
value: [0.96213184 0.95091164 0.96774194 0.94537815 0.94257703 0.94257703
|
|
0.95238095 0.94677871 0.94397759 0.94257703]
|
|
|
|
mean value: 0.949703191234418
|
|
|
|
key: test_fscore
|
|
value: [0.53333333 0.63636364 0.76190476 0.73684211 0.7 0.73684211
|
|
0.57142857 0.53333333 0.54545455 0.58823529]
|
|
|
|
mean value: 0.6343737686462144
|
|
|
|
key: train_fscore
|
|
value: [0.85405405 0.80225989 0.87567568 0.77966102 0.76836158 0.76836158
|
|
0.80898876 0.78651685 0.77011494 0.76571429]
|
|
|
|
mean value: 0.7979708643746889
|
|
|
|
key: test_precision
|
|
value: [1. 0.63636364 0.88888889 0.875 0.77777778 0.875
|
|
0.6 1. 0.54545455 0.83333333]
|
|
|
|
mean value: 0.8031818181818182
|
|
|
|
key: train_precision
|
|
value: [0.92941176 0.92207792 0.94186047 0.8961039 0.88311688 0.88311688
|
|
0.92307692 0.8974359 0.90540541 0.89333333]
|
|
|
|
mean value: 0.9074939373489305
|
|
|
|
key: test_recall
|
|
value: [0.36363636 0.63636364 0.66666667 0.63636364 0.63636364 0.63636364
|
|
0.54545455 0.36363636 0.54545455 0.45454545]
|
|
|
|
mean value: 0.5484848484848485
|
|
|
|
key: train_recall
|
|
value: [0.79 0.71 0.81818182 0.69 0.68 0.68
|
|
0.72 0.7 0.67 0.67 ]
|
|
|
|
mean value: 0.7128181818181818
|
|
|
|
key: test_roc_auc
|
|
value: [0.68181818 0.78919631 0.82598039 0.81082888 0.80347594 0.81082888
|
|
0.74331551 0.68181818 0.73596257 0.71991979]
|
|
|
|
mean value: 0.7603144617530807
|
|
|
|
key: train_roc_auc
|
|
value: [0.89010604 0.85010604 0.90501925 0.83848534 0.83267101 0.83267101
|
|
0.85511401 0.84348534 0.82929967 0.82848534]
|
|
|
|
mean value: 0.8505443046015629
|
|
|
|
key: test_jcc
|
|
value: [0.36363636 0.46666667 0.61538462 0.58333333 0.53846154 0.58333333
|
|
0.4 0.36363636 0.375 0.41666667]
|
|
|
|
mean value: 0.47061188811188814
|
|
|
|
key: train_jcc
|
|
value: [0.74528302 0.66981132 0.77884615 0.63888889 0.62385321 0.62385321
|
|
0.67924528 0.64814815 0.62616822 0.62037037]
|
|
|
|
mean value: 0.6654467830212485
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01778674 0.01836705 0.01631165 0.0161674 0.01639915 0.01633239
|
|
0.01626897 0.01639223 0.01605344 0.01620436]
|
|
|
|
mean value: 0.016628336906433106
|
|
|
|
key: score_time
|
|
value: [0.01443315 0.01439118 0.01305008 0.01270366 0.01318622 0.01309729
|
|
0.01328421 0.01310253 0.01316381 0.01318336]
|
|
|
|
mean value: 0.013359546661376953
|
|
|
|
key: test_mcc
|
|
value: [0.48480731 0.30776281 0.60784314 0.48361682 0.67911951 0.42011668
|
|
0.32322935 0.41317454 0.42011668 0.51178719]
|
|
|
|
mean value: 0.4651574048278671
|
|
|
|
key: train_mcc
|
|
value: [0.5430384 0.562911 0.56735299 0.52682013 0.55150264 0.55027254
|
|
0.55027254 0.53938961 0.53136366 0.54058816]
|
|
|
|
mean value: 0.5463511669174994
|
|
|
|
key: test_accuracy
|
|
value: [0.8625 0.75 0.9 0.86075949 0.89873418 0.79746835
|
|
0.79746835 0.84810127 0.79746835 0.87341772]
|
|
|
|
mean value: 0.8385917721518987
|
|
|
|
key: train_accuracy
|
|
value: [0.8569425 0.86956522 0.87096774 0.87114846 0.87254902 0.86554622
|
|
0.86554622 0.85994398 0.8627451 0.8557423 ]
|
|
|
|
mean value: 0.8650696744335883
|
|
|
|
key: test_fscore
|
|
value: [0.56 0.41176471 0.66666667 0.56 0.71428571 0.5
|
|
0.42857143 0.5 0.5 0.58333333]
|
|
|
|
mean value: 0.5424621848739496
|
|
|
|
key: train_fscore
|
|
value: [0.60465116 0.62348178 0.62601626 0.59649123 0.61603376 0.61290323
|
|
0.61290323 0.6031746 0.59836066 0.6023166 ]
|
|
|
|
mean value: 0.6096332500516068
|
|
|
|
key: test_precision
|
|
value: [0.5 0.30434783 0.66666667 0.5 0.58823529 0.38095238
|
|
0.35294118 0.46153846 0.38095238 0.53846154]
|
|
|
|
mean value: 0.46740957252466203
|
|
|
|
key: train_precision
|
|
value: [0.49367089 0.52380952 0.52380952 0.53125 0.53284672 0.51351351
|
|
0.51351351 0.5 0.50694444 0.49056604]
|
|
|
|
mean value: 0.5129924158230784
|
|
|
|
key: test_recall
|
|
value: [0.63636364 0.63636364 0.66666667 0.63636364 0.90909091 0.72727273
|
|
0.54545455 0.54545455 0.72727273 0.63636364]
|
|
|
|
mean value: 0.6666666666666666
|
|
|
|
key: train_recall
|
|
value: [0.78 0.77 0.77777778 0.68 0.73 0.76
|
|
0.76 0.76 0.73 0.78 ]
|
|
|
|
mean value: 0.7527777777777778
|
|
|
|
key: test_roc_auc
|
|
value: [0.76745718 0.70223979 0.80392157 0.76671123 0.90307487 0.76804813
|
|
0.69184492 0.72125668 0.76804813 0.77406417]
|
|
|
|
mean value: 0.7666666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.82474715 0.82790375 0.83188563 0.79114007 0.81288274 0.82136808
|
|
0.82136808 0.81811075 0.80718241 0.82403909]
|
|
|
|
mean value: 0.8180627733998379
|
|
|
|
key: test_jcc
|
|
value: [0.38888889 0.25925926 0.5 0.38888889 0.55555556 0.33333333
|
|
0.27272727 0.33333333 0.33333333 0.41176471]
|
|
|
|
mean value: 0.37770845712022183
|
|
|
|
key: train_jcc
|
|
value: [0.43333333 0.45294118 0.4556213 0.425 0.44512195 0.44186047
|
|
0.44186047 0.43181818 0.42690058 0.43093923]
|
|
|
|
mean value: 0.43853966861639804
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.03735423 0.041116 0.06425524 0.04450727 0.05348682 0.03800988
|
|
0.05037355 0.03638315 0.0449841 0.01800466]
|
|
|
|
mean value: 0.042847490310668944
|
|
|
|
key: score_time
|
|
value: [0.02528548 0.02139068 0.0293529 0.02545118 0.02264762 0.02740812
|
|
0.02261806 0.0254271 0.02451134 0.01420069]
|
|
|
|
mean value: 0.023829317092895506
|
|
|
|
key: test_mcc
|
|
value: [-0.10309229 0.22988544 0.38549554 0.06681376 0.71280758 0.16073112
|
|
0.38925288 0.24065419 0.00325735 0.43676935]
|
|
|
|
mean value: 0.25225749117040186
|
|
|
|
key: train_mcc
|
|
value: [0.34313123 0.31282773 0.31079126 0.36258434 0.28090475 0.33865482
|
|
0.38265799 0.36908667 0.32684012 0.30718987]
|
|
|
|
mean value: 0.33346687688314813
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.8375 0.875 0.78481013 0.93670886 0.83544304
|
|
0.87341772 0.86075949 0.79746835 0.88607595]
|
|
|
|
mean value: 0.8487183544303798
|
|
|
|
key: train_accuracy
|
|
value: [0.86535764 0.86115007 0.86115007 0.8697479 0.85294118 0.86414566
|
|
0.86834734 0.86554622 0.85854342 0.8557423 ]
|
|
|
|
mean value: 0.8622671789613461
|
|
|
|
key: test_fscore
|
|
value: [0. 0.31578947 0.375 0.19047619 0.70588235 0.23529412
|
|
0.44444444 0.26666667 0.11111111 0.47058824]
|
|
|
|
mean value: 0.3115252592264976
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
key: train_fscore
|
|
value: [0.4 0.36942675 0.36942675 0.41509434 0.34782609 0.39751553
|
|
0.44705882 0.43529412 0.39520958 0.37575758]
|
|
|
|
mean value: 0.3952609555486557
|
|
|
|
key: test_precision
|
|
value: [0. 0.375 0.75 0.2 1. 0.33333333
|
|
0.57142857 0.5 0.14285714 0.66666667]
|
|
|
|
mean value: 0.4539285714285714
|
|
|
|
key: train_precision
|
|
value: [0.53333333 0.50877193 0.5 0.55932203 0.45901639 0.52459016
|
|
0.54285714 0.52857143 0.49253731 0.47692308]
|
|
|
|
mean value: 0.5125922816217733
|
|
|
|
key: test_recall
|
|
value: [0. 0.27272727 0.25 0.18181818 0.54545455 0.18181818
|
|
0.36363636 0.18181818 0.09090909 0.36363636]
|
|
|
|
mean value: 0.24318181818181817
|
|
|
|
key: train_recall
|
|
value: [0.32 0.29 0.29292929 0.33 0.28 0.32
|
|
0.38 0.37 0.33 0.31 ]
|
|
|
|
mean value: 0.3222929292929293
|
|
|
|
key: test_roc_auc
|
|
value: [0.46376812 0.60013175 0.61764706 0.53208556 0.77272727 0.56149733
|
|
0.65975936 0.57620321 0.5013369 0.6671123 ]
|
|
|
|
mean value: 0.5952268852204914
|
|
|
|
key: train_roc_auc
|
|
value: [0.6371615 0.6221615 0.62284901 0.64382736 0.61312704 0.63638436
|
|
0.66394137 0.65812704 0.6373127 0.6273127 ]
|
|
|
|
mean value: 0.6362204586206717
|
|
|
|
key: test_jcc
|
|
value: [0. 0.1875 0.23076923 0.10526316 0.54545455 0.13333333
|
|
0.28571429 0.15384615 0.05882353 0.30769231]
|
|
|
|
mean value: 0.20083965441163584
|
|
|
|
key: train_jcc
|
|
value: [0.25 0.2265625 0.2265625 0.26190476 0.21052632 0.24806202
|
|
0.28787879 0.27819549 0.24626866 0.23134328]
|
|
|
|
mean value: 0.24673043100972114
|
|
|
|
MCC on Blind test: 0.03
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.04181695 0.0169158 0.02974534 0.01720977 0.01437426 0.03081775
|
|
0.03508019 0.01550221 0.03039002 0.02925754]
|
|
|
|
mean value: 0.02611098289489746
|
|
|
|
key: score_time
|
|
value: [0.12297058 0.04593039 0.05504537 0.06402946 0.04561687 0.09929895
|
|
0.05882621 0.05893588 0.04028583 0.04817009]
|
|
|
|
mean value: 0.06391096115112305
|
|
|
|
key: test_mcc
|
|
value: [ 0.11224603 0.16855623 0.46987149 0.28152101 0.16794369 0.16794369
|
|
0.11138831 0. -0.09288407 0.28152101]
|
|
|
|
mean value: 0.16681073908647326
|
|
|
|
key: train_mcc
|
|
value: [0.44887065 0.40655068 0.38581163 0.42142248 0.41706698 0.43847856
|
|
0.4590301 0.34027251 0.39495657 0.39552265]
|
|
|
|
mean value: 0.41079827922829903
|
|
|
|
key: test_accuracy
|
|
value: [0.85 0.8625 0.8875 0.87341772 0.86075949 0.86075949
|
|
0.84810127 0.86075949 0.81012658 0.87341772]
|
|
|
|
mean value: 0.8587341772151899
|
|
|
|
key: train_accuracy
|
|
value: [0.89200561 0.88639551 0.88499299 0.88795518 0.88795518 0.8907563
|
|
0.89355742 0.87815126 0.88515406 0.88515406]
|
|
|
|
mean value: 0.887207758278627
|
|
|
|
key: test_fscore
|
|
value: [0.14285714 0.15384615 0.4 0.16666667 0.15384615 0.15384615
|
|
0.14285714 0. 0. 0.16666667]
|
|
|
|
mean value: 0.14805860805860807
|
|
|
|
key: train_fscore
|
|
value: [0.39370079 0.36220472 0.32786885 0.40298507 0.36507937 0.4
|
|
0.40625 0.304 0.33870968 0.34920635]
|
|
|
|
mean value: 0.3650004830601975
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.5 1. 1. 0.5 0.5
|
|
0.33333333 0. 0. 1. ]
|
|
|
|
mean value: 0.5166666666666666
|
|
|
|
key: train_precision
|
|
value: [0.92592593 0.85185185 0.86956522 0.79411765 0.88461538 0.86666667
|
|
0.92857143 0.76 0.875 0.84615385]
|
|
|
|
mean value: 0.8602467968235231
|
|
|
|
key: test_recall
|
|
value: [0.09090909 0.09090909 0.25 0.09090909 0.09090909 0.09090909
|
|
0.09090909 0. 0. 0.09090909]
|
|
|
|
mean value: 0.08863636363636364
|
|
|
|
key: train_recall
|
|
value: [0.25 0.23 0.2020202 0.27 0.23 0.26 0.26
|
|
0.19 0.21 0.22 ]
|
|
|
|
mean value: 0.23220202020202022
|
|
|
|
key: test_roc_auc
|
|
value: [0.53096179 0.53820817 0.625 0.54545455 0.5381016 0.5381016
|
|
0.53074866 0.5 0.47058824 0.54545455]
|
|
|
|
mean value: 0.5362619158335271
|
|
|
|
key: train_roc_auc
|
|
value: [0.62336868 0.61173736 0.5985671 0.62929967 0.612557 0.62674267
|
|
0.62837134 0.59011401 0.602557 0.60674267]
|
|
|
|
mean value: 0.6130057504977346
|
|
|
|
key: test_jcc
|
|
value: [0.07692308 0.08333333 0.25 0.09090909 0.08333333 0.08333333
|
|
0.07692308 0. 0. 0.09090909]
|
|
|
|
mean value: 0.08356643356643356
|
|
|
|
key: train_jcc
|
|
value: [0.24509804 0.22115385 0.19607843 0.25233645 0.22330097 0.25
|
|
0.25490196 0.17924528 0.2038835 0.21153846]
|
|
|
|
mean value: 0.2237536936701273
|
|
|
|
MCC on Blind test: -0.09
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03786564 0.03764153 0.07014346 0.03748584 0.03807282 0.03875828
|
|
0.02647471 0.02575278 0.02461553 0.02594876]
|
|
|
|
mean value: 0.036275935173034665
|
|
|
|
key: score_time
|
|
value: [0.01906037 0.01942945 0.01911497 0.01885724 0.01908636 0.03045058
|
|
0.01306677 0.01287198 0.01263452 0.01324749]
|
|
|
|
mean value: 0.017781972885131836
|
|
|
|
key: test_mcc
|
|
value: [0.28178291 0.34676496 0.26782449 0. 0. 0.49398293
|
|
0.16794369 0.28152101 0.11138831 0. ]
|
|
|
|
mean value: 0.1951208310227963
|
|
|
|
key: train_mcc
|
|
value: [0.47977517 0.47977517 0.3499354 0.50061187 0.40746174 0.39619069
|
|
0.4700762 0.41846368 0.52519566 0.47982099]
|
|
|
|
mean value: 0.45073065750677455
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.8625 0.86075949 0.86075949 0.89873418
|
|
0.86075949 0.87341772 0.84810127 0.86075949]
|
|
|
|
mean value: 0.8675791139240506
|
|
|
|
key: train_accuracy
|
|
value: [0.89621318 0.89621318 0.88078541 0.89915966 0.88655462 0.88515406
|
|
0.89495798 0.88795518 0.90336134 0.89635854]
|
|
|
|
mean value: 0.8926713181766395
|
|
|
|
key: test_fscore
|
|
value: [0.16666667 0.375 0.15384615 0. 0. 0.42857143
|
|
0.15384615 0.16666667 0.14285714 0. ]
|
|
|
|
mean value: 0.15874542124542124
|
|
|
|
key: train_fscore
|
|
value: [0.421875 0.421875 0.26086957 0.4375 0.33057851 0.31666667
|
|
0.40944882 0.3442623 0.48888889 0.421875 ]
|
|
|
|
mean value: 0.38538397471492464
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 1. 0. 0. 1.
|
|
0.5 1. 0.33333333 0. ]
|
|
|
|
mean value: 0.5433333333333333
|
|
|
|
key: train_precision
|
|
value: [0.96428571 0.96428571 0.9375 1. 0.95238095 0.95
|
|
0.96296296 0.95454545 0.94285714 0.96428571]
|
|
|
|
mean value: 0.9593103655603655
|
|
|
|
key: test_recall
|
|
value: [0.09090909 0.27272727 0.08333333 0. 0. 0.27272727
|
|
0.09090909 0.09090909 0.09090909 0. ]
|
|
|
|
mean value: 0.09924242424242424
|
|
|
|
key: train_recall
|
|
value: [0.27 0.27 0.15151515 0.28 0.2 0.19
|
|
0.26 0.21 0.33 0.27 ]
|
|
|
|
mean value: 0.24315151515151517
|
|
|
|
key: test_roc_auc
|
|
value: [0.54545455 0.62187088 0.54166667 0.5 0.5 0.63636364
|
|
0.5381016 0.54545455 0.53074866 0.5 ]
|
|
|
|
mean value: 0.5459660544059521
|
|
|
|
key: train_roc_auc
|
|
value: [0.63418434 0.63418434 0.57494324 0.64 0.59918567 0.59418567
|
|
0.62918567 0.60418567 0.66337134 0.63418567]
|
|
|
|
mean value: 0.620761159640681
|
|
|
|
key: test_jcc
|
|
value: [0.09090909 0.23076923 0.08333333 0. 0. 0.27272727
|
|
0.08333333 0.09090909 0.07692308 0. ]
|
|
|
|
mean value: 0.09289044289044289
|
|
|
|
key: train_jcc
|
|
value: [0.26732673 0.26732673 0.15 0.28 0.1980198 0.18811881
|
|
0.25742574 0.20792079 0.32352941 0.26732673]
|
|
|
|
mean value: 0.24069947582993595
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [4.76756167 5.27475548 5.63623118 5.40502548 6.04161572 5.8417573
|
|
6.6231823 6.51457071 5.34702921 3.33198285]
|
|
|
|
mean value: 5.47837119102478
|
|
|
|
key: score_time
|
|
value: [0.01546693 0.0225172 0.01796293 0.02164364 0.04560542 0.0318253
|
|
0.03422785 0.02604556 0.01566219 0.02625656]
|
|
|
|
mean value: 0.025721359252929687
|
|
|
|
key: test_mcc
|
|
value: [0.43754361 0.51286858 0.61325296 0.36631016 0.66135521 0.59219173
|
|
0.43119194 0.43676935 0.54627358 0.71339159]
|
|
|
|
mean value: 0.5311148709191367
|
|
|
|
key: train_mcc
|
|
value: [0.95884344 0.98833851 0.97045246 0.98250043 0.97661349 0.97071011
|
|
0.98837134 0.95893173 0.95893173 0.97073494]
|
|
|
|
mean value: 0.9724428173839278
|
|
|
|
key: test_accuracy
|
|
value: [0.8875 0.875 0.9125 0.84810127 0.92405063 0.91139241
|
|
0.87341772 0.88607595 0.89873418 0.93670886]
|
|
|
|
mean value: 0.8953481012658228
|
|
|
|
key: train_accuracy
|
|
value: [0.99018233 0.99719495 0.99298738 0.99579832 0.99439776 0.9929972
|
|
0.99719888 0.99019608 0.99019608 0.9929972 ]
|
|
|
|
mean value: 0.9934146168986528
|
|
|
|
key: test_fscore
|
|
value: [0.47058824 0.58333333 0.63157895 0.45454545 0.7 0.63157895
|
|
0.5 0.47058824 0.6 0.73684211]
|
|
|
|
mean value: 0.5779055258467023
|
|
|
|
key: train_fscore
|
|
value: [0.96410256 0.98989899 0.97435897 0.98492462 0.97979798 0.97461929
|
|
0.99 0.96446701 0.96446701 0.97435897]
|
|
|
|
mean value: 0.9760995405125446
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.53846154 0.85714286 0.45454545 0.77777778 0.75
|
|
0.55555556 0.66666667 0.66666667 0.875 ]
|
|
|
|
mean value: 0.6808483183483184
|
|
|
|
key: train_precision
|
|
value: [0.98947368 1. 0.98958333 0.98989899 0.98979592 0.98969072
|
|
0.99 0.97938144 0.97938144 1. ]
|
|
|
|
mean value: 0.9897205534057619
|
|
|
|
key: test_recall
|
|
value: [0.36363636 0.63636364 0.5 0.45454545 0.63636364 0.54545455
|
|
0.45454545 0.36363636 0.54545455 0.63636364]
|
|
|
|
mean value: 0.5136363636363637
|
|
|
|
key: train_recall
|
|
value: [0.94 0.98 0.95959596 0.98 0.97 0.96
|
|
0.99 0.95 0.95 0.95 ]
|
|
|
|
mean value: 0.9629595959595959
|
|
|
|
key: test_roc_auc
|
|
value: [0.66732543 0.77470356 0.74264706 0.68315508 0.80347594 0.75802139
|
|
0.69786096 0.6671123 0.75066845 0.81082888]
|
|
|
|
mean value: 0.7355799038983183
|
|
|
|
key: train_roc_auc
|
|
value: [0.96918434 0.99 0.97898365 0.98918567 0.98418567 0.97918567
|
|
0.99418567 0.97337134 0.97337134 0.975 ]
|
|
|
|
mean value: 0.9806653328884811
|
|
|
|
key: test_jcc
|
|
value: [0.30769231 0.41176471 0.46153846 0.29411765 0.53846154 0.46153846
|
|
0.33333333 0.30769231 0.42857143 0.58333333]
|
|
|
|
mean value: 0.41280435251023484
|
|
|
|
key: train_jcc
|
|
value: [0.93069307 0.98 0.95 0.97029703 0.96039604 0.95049505
|
|
0.98019802 0.93137255 0.93137255 0.95 ]
|
|
|
|
mean value: 0.9534824305960008
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04919791 0.04214954 0.03464055 0.04058719 0.03853202 0.03963971
|
|
0.0449214 0.04226875 0.04017758 0.04114819]
|
|
|
|
mean value: 0.041326284408569336
|
|
|
|
key: score_time
|
|
value: [0.01094723 0.00959659 0.00930953 0.00933218 0.01026773 0.00946903
|
|
0.00933957 0.0098052 0.00962543 0.00986981]
|
|
|
|
mean value: 0.009756231307983398
|
|
|
|
key: test_mcc
|
|
value: [0.51864618 0.55216696 0.64550223 0.57937053 0.57937053 0.77643684
|
|
0.31611031 0.7090125 0.43119194 0.48361682]
|
|
|
|
mean value: 0.5591424848620753
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.875 0.9125 0.88607595 0.88607595 0.94936709
|
|
0.84810127 0.92405063 0.87341772 0.86075949]
|
|
|
|
mean value: 0.8915348101265823
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.55555556 0.61538462 0.69565217 0.64 0.64 0.8
|
|
0.4 0.75 0.5 0.56 ]
|
|
|
|
mean value: 0.6156592344853214
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.53333333 0.72727273 0.57142857 0.57142857 0.88888889
|
|
0.44444444 0.69230769 0.55555556 0.5 ]
|
|
|
|
mean value: 0.6198945498945498
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.45454545 0.72727273 0.66666667 0.72727273 0.72727273 0.72727273
|
|
0.36363636 0.81818182 0.45454545 0.63636364]
|
|
|
|
mean value: 0.6303030303030304
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71277997 0.81291173 0.81127451 0.81951872 0.81951872 0.85628342
|
|
0.64505348 0.87967914 0.69786096 0.76671123]
|
|
|
|
mean value: 0.7821591877857863
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.38461538 0.44444444 0.53333333 0.47058824 0.47058824 0.66666667
|
|
0.25 0.6 0.33333333 0.38888889]
|
|
|
|
mean value: 0.4542458521870286
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17240572 0.16647243 0.16662502 0.16827416 0.15932035 0.15240288
|
|
0.14529085 0.15759325 0.16292405 0.15607858]
|
|
|
|
mean value: 0.16073873043060302
|
|
|
|
key: score_time
|
|
value: [0.02106047 0.02116942 0.02069497 0.02151155 0.02009916 0.02284741
|
|
0.01929522 0.02032804 0.02048707 0.02063346]
|
|
|
|
mean value: 0.020812678337097167
|
|
|
|
key: test_mcc
|
|
value: [0.49436016 0.22988544 0.54611868 0.30268562 0.30268562 0.40742332
|
|
0.43676935 0.40742332 0.07388506 0.57419245]
|
|
|
|
mean value: 0.37754290201872565
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.8375 0.9 0.87341772 0.87341772 0.88607595
|
|
0.88607595 0.88607595 0.83544304 0.91139241]
|
|
|
|
mean value: 0.8789398734177215
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.42857143 0.31578947 0.5 0.28571429 0.28571429 0.4
|
|
0.47058824 0.4 0.13333333 0.53333333]
|
|
|
|
mean value: 0.3753044375644995
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.375 1. 0.66666667 0.66666667 0.75
|
|
0.66666667 0.75 0.25 1. ]
|
|
|
|
mean value: 0.7125
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.27272727 0.27272727 0.33333333 0.18181818 0.18181818 0.27272727
|
|
0.36363636 0.27272727 0.09090909 0.36363636]
|
|
|
|
mean value: 0.2606060606060606
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.63636364 0.60013175 0.66666667 0.58355615 0.58355615 0.6290107
|
|
0.6671123 0.6290107 0.52339572 0.68181818]
|
|
|
|
mean value: 0.6200621948384097
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.27272727 0.1875 0.33333333 0.16666667 0.16666667 0.25
|
|
0.30769231 0.25 0.07142857 0.36363636]
|
|
|
|
mean value: 0.2369651182151182
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01228523 0.01270628 0.01270533 0.01234698 0.01202917 0.01263547
|
|
0.01246166 0.01153898 0.01213789 0.01210713]
|
|
|
|
mean value: 0.012295413017272949
|
|
|
|
key: score_time
|
|
value: [0.00941229 0.0091939 0.00986099 0.00935078 0.00958705 0.00989294
|
|
0.00903296 0.00943208 0.00941062 0.00903535]
|
|
|
|
mean value: 0.00942089557647705
|
|
|
|
key: test_mcc
|
|
value: [0.17835014 0.23888665 0.18280094 0.28674237 0.21594923 0.36631016
|
|
0.23726791 0.36631016 0.27141973 0.08594704]
|
|
|
|
mean value: 0.2429984334695301
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.8125 0.8125 0.83544304 0.79746835 0.84810127
|
|
0.81012658 0.84810127 0.79746835 0.79746835]
|
|
|
|
mean value: 0.8171677215189873
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.28571429 0.34782609 0.28571429 0.38095238 0.33333333 0.45454545
|
|
0.34782609 0.45454545 0.38461538 0.2 ]
|
|
|
|
mean value: 0.3475072753333623
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.3 0.33333333 0.33333333 0.4 0.30769231 0.45454545
|
|
0.33333333 0.45454545 0.33333333 0.22222222]
|
|
|
|
mean value: 0.3472338772338772
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.27272727 0.36363636 0.25 0.36363636 0.36363636 0.45454545
|
|
0.36363636 0.45454545 0.45454545 0.18181818]
|
|
|
|
mean value: 0.3522727272727273
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.585639 0.62384717 0.58088235 0.63770053 0.61564171 0.68315508
|
|
0.62299465 0.68315508 0.65374332 0.5394385 ]
|
|
|
|
mean value: 0.6226197395954429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.16666667 0.21052632 0.16666667 0.23529412 0.2 0.29411765
|
|
0.21052632 0.29411765 0.23809524 0.11111111]
|
|
|
|
mean value: 0.21271217258833358
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.46597672 2.42714763 2.48374224 2.4591248 2.46944237 2.50179291
|
|
2.38671589 2.47136688 2.56136847 2.70485163]
|
|
|
|
mean value: 2.493152952194214
|
|
|
|
key: score_time
|
|
value: [0.10462785 0.09992313 0.09720373 0.10394239 0.10339022 0.10226011
|
|
0.10427213 0.0963974 0.10097837 0.09991646]
|
|
|
|
mean value: 0.10129117965698242
|
|
|
|
key: test_mcc
|
|
value: [0.40104758 0.51864618 0.73714245 0.49398293 0.40742332 0.64628973
|
|
0.59219173 0.64628973 0.34595509 0.64658323]
|
|
|
|
mean value: 0.5435551986075632
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.8875 0.9 0.9375 0.89873418 0.88607595 0.92405063
|
|
0.91139241 0.92405063 0.87341772 0.92405063]
|
|
|
|
mean value: 0.9066772151898734
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.30769231 0.55555556 0.73684211 0.42857143 0.4 0.625
|
|
0.63157895 0.625 0.375 0.66666667]
|
|
|
|
mean value: 0.5351907011117537
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.71428571 1. 1. 0.75 1.
|
|
0.75 1. 0.6 0.85714286]
|
|
|
|
mean value: 0.8671428571428571
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.18181818 0.45454545 0.58333333 0.27272727 0.27272727 0.45454545
|
|
0.54545455 0.45454545 0.27272727 0.54545455]
|
|
|
|
mean value: 0.40378787878787875
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.59090909 0.71277997 0.79166667 0.63636364 0.6290107 0.72727273
|
|
0.75802139 0.72727273 0.62165775 0.76537433]
|
|
|
|
mean value: 0.6960328993257382
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.18181818 0.38461538 0.58333333 0.27272727 0.25 0.45454545
|
|
0.46153846 0.45454545 0.23076923 0.5 ]
|
|
|
|
mean value: 0.3773892773892774
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.23748446 1.20426846 1.20782495 1.06972313 1.0778296 1.18399477
|
|
1.38125014 1.35519075 1.36815524 1.3413291 ]
|
|
|
|
mean value: 1.3427050590515137
|
|
|
|
key: score_time
|
|
value: [0.2090075 0.18635893 0.19955277 0.20219612 0.1430521 0.22746968
|
|
0.17598581 0.18938136 0.17117977 0.23886013]
|
|
|
|
mean value: 0.19430441856384278
|
|
|
|
key: test_mcc
|
|
value: [0.28178291 0.34676496 0.6146363 0.40070776 0.40070776 0.64628973
|
|
0.43676935 0.49398293 0.11138831 0.49612241]
|
|
|
|
mean value: 0.42291524210944176
|
|
|
|
key: train_mcc
|
|
value: [0.80953731 0.81042317 0.79443166 0.78313752 0.78313752 0.77644144
|
|
0.77644144 0.8164158 0.78313752 0.78979757]
|
|
|
|
mean value: 0.7922900961329172
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.9125 0.88607595 0.88607595 0.92405063
|
|
0.88607595 0.89873418 0.84810127 0.89873418]
|
|
|
|
mean value: 0.8890348101265823
|
|
|
|
key: train_accuracy
|
|
value: [0.95652174 0.95652174 0.95371669 0.95098039 0.95098039 0.94957983
|
|
0.94957983 0.95798319 0.95098039 0.95238095]
|
|
|
|
mean value: 0.9529225154297343
|
|
|
|
key: test_fscore
|
|
value: [0.16666667 0.375 0.58823529 0.30769231 0.30769231 0.625
|
|
0.47058824 0.42857143 0.14285714 0.5 ]
|
|
|
|
mean value: 0.3912303382891618
|
|
|
|
key: train_fscore
|
|
value: [0.82080925 0.81656805 0.80473373 0.79289941 0.79289941 0.78571429
|
|
0.78571429 0.8255814 0.79289941 0.8 ]
|
|
|
|
mean value: 0.8017819215332322
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 1. 1. 1. 1.
|
|
0.66666667 1. 0.33333333 0.8 ]
|
|
|
|
mean value: 0.84
|
|
|
|
key: train_precision
|
|
value: [0.97260274 1. 0.97142857 0.97101449 0.97101449 0.97058824
|
|
0.97058824 0.98611111 0.97101449 0.97142857]
|
|
|
|
mean value: 0.9755790942543386
|
|
|
|
key: test_recall
|
|
value: [0.09090909 0.27272727 0.41666667 0.18181818 0.18181818 0.45454545
|
|
0.36363636 0.27272727 0.09090909 0.36363636]
|
|
|
|
mean value: 0.2689393939393939
|
|
|
|
key: train_recall
|
|
value: [0.71 0.69 0.68686869 0.67 0.67 0.66
|
|
0.66 0.71 0.67 0.68 ]
|
|
|
|
mean value: 0.6806868686868687
|
|
|
|
key: test_roc_auc
|
|
value: [0.54545455 0.62187088 0.70833333 0.59090909 0.59090909 0.72727273
|
|
0.6671123 0.63636364 0.53074866 0.67446524]
|
|
|
|
mean value: 0.6293439510191429
|
|
|
|
key: train_roc_auc
|
|
value: [0.85336868 0.845 0.84180568 0.83337134 0.83337134 0.82837134
|
|
0.82837134 0.85418567 0.83337134 0.83837134]
|
|
|
|
mean value: 0.8389588038350678
|
|
|
|
key: test_jcc
|
|
value: [0.09090909 0.23076923 0.41666667 0.18181818 0.18181818 0.45454545
|
|
0.30769231 0.27272727 0.07692308 0.33333333]
|
|
|
|
mean value: 0.2547202797202797
|
|
|
|
key: train_jcc
|
|
value: [0.69607843 0.69 0.67326733 0.65686275 0.65686275 0.64705882
|
|
0.64705882 0.7029703 0.65686275 0.66666667]
|
|
|
|
mean value: 0.6693688604154533
|
|
|
|
MCC on Blind test: 0.49
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0197196 0.03598475 0.03622699 0.03120279 0.01471877 0.03621674
|
|
0.02270555 0.03164887 0.03560638 0.04625273]
|
|
|
|
mean value: 0.031028318405151366
|
|
|
|
key: score_time
|
|
value: [0.01119256 0.02131534 0.09772825 0.01156521 0.011374 0.04016137
|
|
0.01889348 0.02915859 0.02790117 0.02027225]
|
|
|
|
mean value: 0.028956222534179687
|
|
|
|
key: test_mcc
|
|
value: [-0.10309229 0.22988544 0.38549554 0.06681376 0.71280758 0.16073112
|
|
0.38925288 0.24065419 0.00325735 0.43676935]
|
|
|
|
mean value: 0.25225749117040186
|
|
|
|
key: train_mcc
|
|
value: [0.34313123 0.31282773 0.31079126 0.36258434 0.28090475 0.33865482
|
|
0.38265799 0.36908667 0.32684012 0.30718987]
|
|
|
|
mean value: 0.33346687688314813
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.8375 0.875 0.78481013 0.93670886 0.83544304
|
|
0.87341772 0.86075949 0.79746835 0.88607595]
|
|
|
|
mean value: 0.8487183544303798
|
|
|
|
key: train_accuracy
|
|
value: [0.86535764 0.86115007 0.86115007 0.8697479 0.85294118 0.86414566
|
|
0.86834734 0.86554622 0.85854342 0.8557423 ]
|
|
|
|
mean value: 0.8622671789613461
|
|
|
|
key: test_fscore
|
|
value: [0. 0.31578947 0.375 0.19047619 0.70588235 0.23529412
|
|
0.44444444 0.26666667 0.11111111 0.47058824]
|
|
|
|
mean value: 0.3115252592264976
|
|
|
|
key: train_fscore
|
|
value: [0.4 0.36942675 0.36942675 0.41509434 0.34782609 0.39751553
|
|
0.44705882 0.43529412 0.39520958 0.37575758]
|
|
|
|
mean value: 0.3952609555486557
|
|
|
|
key: test_precision
|
|
value: [0. 0.375 0.75 0.2 1. 0.33333333
|
|
0.57142857 0.5 0.14285714 0.66666667]
|
|
|
|
mean value: 0.4539285714285714
|
|
|
|
key: train_precision
|
|
value: [0.53333333 0.50877193 0.5 0.55932203 0.45901639 0.52459016
|
|
0.54285714 0.52857143 0.49253731 0.47692308]
|
|
|
|
mean value: 0.5125922816217733
|
|
|
|
key: test_recall
|
|
value: [0. 0.27272727 0.25 0.18181818 0.54545455 0.18181818
|
|
0.36363636 0.18181818 0.09090909 0.36363636]
|
|
|
|
mean value: 0.24318181818181817
|
|
|
|
key: train_recall
|
|
value: [0.32 0.29 0.29292929 0.33 0.28 0.32
|
|
0.38 0.37 0.33 0.31 ]
|
|
|
|
mean value: 0.3222929292929293
|
|
|
|
key: test_roc_auc
|
|
value: [0.46376812 0.60013175 0.61764706 0.53208556 0.77272727 0.56149733
|
|
0.65975936 0.57620321 0.5013369 0.6671123 ]
|
|
|
|
mean value: 0.5952268852204914
|
|
|
|
key: train_roc_auc
|
|
value: [0.6371615 0.6221615 0.62284901 0.64382736 0.61312704 0.63638436
|
|
0.66394137 0.65812704 0.6373127 0.6273127 ]
|
|
|
|
mean value: 0.6362204586206717
|
|
|
|
key: test_jcc
|
|
value: [0. 0.1875 0.23076923 0.10526316 0.54545455 0.13333333
|
|
0.28571429 0.15384615 0.05882353 0.30769231]
|
|
|
|
mean value: 0.20083965441163584
|
|
|
|
key: train_jcc
|
|
value: [0.25 0.2265625 0.2265625 0.26190476 0.21052632 0.24806202
|
|
0.28787879 0.27819549 0.24626866 0.23134328]
|
|
|
|
mean value: 0.24673043100972114
|
|
|
|
MCC on Blind test: 0.03
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [1.20054126 3.52299976 2.68894696 2.35090446 0.53907275 7.61027193
|
|
2.27476096 2.3721981 2.27101231 2.30523801]
|
|
|
|
mean value: 2.713594651222229
|
|
|
|
key: score_time
|
|
value: [0.01278949 0.02615476 0.01358294 0.01904464 0.01230979 0.01299477
|
|
0.01300144 0.0128026 0.01257706 0.01270843]
|
|
|
|
mean value: 0.014796590805053711
|
|
|
|
key: test_mcc
|
|
value: [0.64666979 0.57839262 0.74715612 0.77643684 0.72659961 0.83459145
|
|
0.68315508 0.78877005 0.54287929 0.77643684]
|
|
|
|
mean value: 0.7101087702417325
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.925 0.9 0.9375 0.94936709 0.93670886 0.96202532
|
|
0.92405063 0.94936709 0.88607595 0.94936709]
|
|
|
|
mean value: 0.9319462025316456
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.63636364 0.7826087 0.8 0.76190476 0.84210526
|
|
0.72727273 0.81818182 0.60869565 0.8 ]
|
|
|
|
mean value: 0.7402132554706925
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.63636364 0.81818182 0.88888889 0.8 1.
|
|
0.72727273 0.81818182 0.58333333 0.88888889]
|
|
|
|
mean value: 0.8161111111111111
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.45454545 0.63636364 0.75 0.72727273 0.72727273 0.72727273
|
|
0.72727273 0.81818182 0.63636364 0.72727273]
|
|
|
|
mean value: 0.6931818181818182
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.72727273 0.78919631 0.86029412 0.85628342 0.84893048 0.86363636
|
|
0.84157754 0.89438503 0.78141711 0.85628342]
|
|
|
|
mean value: 0.8319276524839185
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.46666667 0.64285714 0.66666667 0.61538462 0.72727273
|
|
0.57142857 0.69230769 0.4375 0.66666667]
|
|
|
|
mean value: 0.5941296203796204
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.17741966 0.09170365 0.13291836 0.10037684 0.14957428 0.16964102
|
|
0.12517238 0.10618401 0.09255886 0.09366941]
|
|
|
|
mean value: 0.12392184734344483
|
|
|
|
key: score_time
|
|
value: [0.03746223 0.0251298 0.03144145 0.04300594 0.02535176 0.02620196
|
|
0.02602458 0.03189564 0.01941037 0.03404975]
|
|
|
|
mean value: 0.02999734878540039
|
|
|
|
key: test_mcc
|
|
value: [0.49671738 0.44219444 0.74715612 0.61039985 0.52515049 0.64432685
|
|
0.50667099 0.51791806 0.47192513 0.54627358]
|
|
|
|
mean value: 0.5508732880524055
|
|
|
|
key: train_mcc
|
|
value: [0.79768654 0.79768654 0.78545027 0.79773102 0.77082394 0.76944991
|
|
0.79638654 0.81288529 0.81288529 0.79249782]
|
|
|
|
mean value: 0.7933483163681401
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.8625 0.9375 0.89873418 0.86075949 0.89873418
|
|
0.88607595 0.89873418 0.87341772 0.89873418]
|
|
|
|
mean value: 0.8915189873417722
|
|
|
|
key: train_accuracy
|
|
value: [0.95231417 0.95231417 0.94950912 0.95238095 0.94677871 0.94677871
|
|
0.95238095 0.95658263 0.95658263 0.95098039]
|
|
|
|
mean value: 0.9516602433399727
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.52173913 0.7826087 0.66666667 0.59259259 0.69230769
|
|
0.57142857 0.55555556 0.54545455 0.6 ]
|
|
|
|
mean value: 0.6028353450092581
|
|
|
|
key: train_fscore
|
|
value: [0.82474227 0.82474227 0.81443299 0.82474227 0.8 0.79787234
|
|
0.82291667 0.83597884 0.83597884 0.82051282]
|
|
|
|
mean value: 0.8201919293377125
|
|
|
|
key: test_precision
|
|
value: [0.8 0.5 0.81818182 0.61538462 0.5 0.6
|
|
0.6 0.71428571 0.54545455 0.66666667]
|
|
|
|
mean value: 0.635997335997336
|
|
|
|
key: train_precision
|
|
value: [0.85106383 0.85106383 0.83157895 0.85106383 0.84444444 0.85227273
|
|
0.85869565 0.88764045 0.88764045 0.84210526]
|
|
|
|
mean value: 0.8557569422655507
|
|
|
|
key: test_recall
|
|
value: [0.36363636 0.54545455 0.75 0.72727273 0.72727273 0.81818182
|
|
0.54545455 0.45454545 0.54545455 0.54545455]
|
|
|
|
mean value: 0.6022727272727273
|
|
|
|
key: train_recall
|
|
value: [0.8 0.8 0.7979798 0.8 0.76 0.75 0.79
|
|
0.79 0.79 0.8 ]
|
|
|
|
mean value: 0.7877979797979798
|
|
|
|
key: test_roc_auc
|
|
value: [0.67457181 0.72924901 0.86029412 0.82687166 0.80481283 0.86497326
|
|
0.74331551 0.71256684 0.73596257 0.75066845]
|
|
|
|
mean value: 0.7703286057506007
|
|
|
|
key: train_roc_auc
|
|
value: [0.88858075 0.88858075 0.88596058 0.88859935 0.86859935 0.86441368
|
|
0.88441368 0.88685668 0.88685668 0.88778502]
|
|
|
|
mean value: 0.8830646513812075
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.35294118 0.64285714 0.5 0.42105263 0.52941176
|
|
0.4 0.38461538 0.375 0.42857143]
|
|
|
|
mean value: 0.43677828621327075
|
|
|
|
key: train_jcc
|
|
value: [0.70175439 0.70175439 0.68695652 0.70175439 0.66666667 0.66371681
|
|
0.69911504 0.71818182 0.71818182 0.69565217]
|
|
|
|
mean value: 0.6953734014984293
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.05694079 0.06395292 0.07268739 0.05766678 0.03244686 0.039891
|
|
0.04338455 0.04891658 0.05110669 0.06775498]
|
|
|
|
mean value: 0.05347485542297363
|
|
|
|
key: score_time
|
|
value: [0.02860403 0.03798938 0.03443909 0.03022146 0.05056453 0.05091286
|
|
0.02191162 0.01679182 0.03358102 0.03731632]
|
|
|
|
mean value: 0.0342332124710083
|
|
|
|
key: test_mcc
|
|
value: [0.3033031 0.47299078 0.54492569 0.2605877 0.64658323 0.59219173
|
|
0.59219173 0.40742332 0.34979201 0.51791806]
|
|
|
|
mean value: 0.46879073368999474
|
|
|
|
key: train_mcc
|
|
value: [0.52799211 0.54819448 0.49772617 0.50359962 0.47984635 0.50675531
|
|
0.49820394 0.50147904 0.54318658 0.49630465]
|
|
|
|
mean value: 0.5103288250765816
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.9 0.84810127 0.92405063 0.91139241
|
|
0.91139241 0.88607595 0.86075949 0.89873418]
|
|
|
|
mean value: 0.8890506329113924
|
|
|
|
key: train_accuracy
|
|
value: [0.89621318 0.90322581 0.89621318 0.89495798 0.89215686 0.89495798
|
|
0.89355742 0.89355742 0.90336134 0.89215686]
|
|
|
|
mean value: 0.8960358056265985
|
|
|
|
key: test_fscore
|
|
value: [0.28571429 0.54545455 0.55555556 0.33333333 0.66666667 0.63157895
|
|
0.63157895 0.4 0.42105263 0.55555556]
|
|
|
|
mean value: 0.5026490468595731
|
|
|
|
key: train_fscore
|
|
value: [0.57954545 0.58682635 0.53164557 0.54545455 0.51572327 0.5508982
|
|
0.54216867 0.54761905 0.57668712 0.5443787 ]
|
|
|
|
mean value: 0.5520946928065821
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.54545455 0.83333333 0.42857143 0.85714286 0.75
|
|
0.75 0.75 0.5 0.71428571]
|
|
|
|
mean value: 0.6795454545454546
|
|
|
|
key: train_precision
|
|
value: [0.67105263 0.73134328 0.71186441 0.69230769 0.69491525 0.68656716
|
|
0.68181818 0.67647059 0.74603175 0.66666667]
|
|
|
|
mean value: 0.6959037615416671
|
|
|
|
key: test_recall
|
|
value: [0.18181818 0.54545455 0.41666667 0.27272727 0.54545455 0.54545455
|
|
0.54545455 0.27272727 0.36363636 0.45454545]
|
|
|
|
mean value: 0.41439393939393937
|
|
|
|
key: train_recall
|
|
value: [0.51 0.49 0.42424242 0.45 0.41 0.46
|
|
0.45 0.46 0.47 0.46 ]
|
|
|
|
mean value: 0.4584242424242424
|
|
|
|
key: test_roc_auc
|
|
value: [0.58366271 0.73649539 0.70098039 0.60695187 0.76537433 0.75802139
|
|
0.75802139 0.6290107 0.65240642 0.71256684]
|
|
|
|
mean value: 0.6903491436100132
|
|
|
|
key: train_roc_auc
|
|
value: [0.73460848 0.73031811 0.69827756 0.70871336 0.69034202 0.71289902
|
|
0.70789902 0.71208469 0.72197068 0.71127036]
|
|
|
|
mean value: 0.7128383307545542
|
|
|
|
key: test_jcc
|
|
value: [0.16666667 0.375 0.38461538 0.2 0.5 0.46153846
|
|
0.46153846 0.25 0.26666667 0.38461538]
|
|
|
|
mean value: 0.3450641025641026
|
|
|
|
key: train_jcc
|
|
value: [0.408 0.41525424 0.36206897 0.375 0.34745763 0.38016529
|
|
0.37190083 0.37704918 0.40517241 0.37398374]
|
|
|
|
mean value: 0.3816052279584871
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03733659 0.11061597 0.06538081 0.06376076 0.06266046 0.06085181
|
|
0.07293844 0.06197333 0.06072569 0.04202867]
|
|
|
|
mean value: 0.0638272523880005
|
|
|
|
key: score_time
|
|
value: [0.01605201 0.05469155 0.03833151 0.080966 0.02012324 0.04428625
|
|
0.04422903 0.04278708 0.03952503 0.04887033]
|
|
|
|
mean value: 0.04298620223999024
|
|
|
|
key: test_mcc
|
|
value: [0.45953084 0.28498089 0.72784016 0.49612241 0.54709854 0.6166353
|
|
0.41220189 0.40070776 0.2605877 0.30268562]
|
|
|
|
mean value: 0.4508391113755412
|
|
|
|
key: train_mcc
|
|
value: [0.62022622 0.3410488 0.65083284 0.6562802 0.58642176 0.6405191
|
|
0.66211276 0.69245266 0.66378948 0.56951461]
|
|
|
|
mean value: 0.6083198435919134
|
|
|
|
key: test_accuracy
|
|
value: [0.825 0.475 0.925 0.89873418 0.82278481 0.91139241
|
|
0.82278481 0.88607595 0.84810127 0.87341772]
|
|
|
|
mean value: 0.8288291139240507
|
|
|
|
key: train_accuracy
|
|
value: [0.86255259 0.56521739 0.88920056 0.92577031 0.82913165 0.92296919
|
|
0.89915966 0.93277311 0.92717087 0.91036415]
|
|
|
|
mean value: 0.8664309482558802
|
|
|
|
key: test_fscore
|
|
value: [0.53333333 0.34375 0.76923077 0.5 0.58823529 0.66666667
|
|
0.5 0.30769231 0.33333333 0.28571429]
|
|
|
|
mean value: 0.4827955990088343
|
|
|
|
key: train_fscore
|
|
value: [0.65492958 0.38976378 0.69019608 0.67080745 0.61392405 0.64968153
|
|
0.70491803 0.7037037 0.67901235 0.53623188]
|
|
|
|
mean value: 0.6293168434362774
|
|
|
|
key: test_precision
|
|
value: [0.42105263 0.20754717 0.71428571 0.8 0.43478261 0.7
|
|
0.41176471 1. 0.42857143 0.66666667]
|
|
|
|
mean value: 0.5784670925492083
|
|
|
|
key: train_precision
|
|
value: [0.50543478 0.24264706 0.56410256 0.8852459 0.44907407 0.89473684
|
|
0.59722222 0.91935484 0.88709677 0.97368421]
|
|
|
|
mean value: 0.6918599269005234
|
|
|
|
key: test_recall
|
|
value: [0.72727273 1. 0.83333333 0.36363636 0.90909091 0.63636364
|
|
0.63636364 0.18181818 0.27272727 0.18181818]
|
|
|
|
mean value: 0.5742424242424242
|
|
|
|
key: train_recall
|
|
value: [0.93 0.99 0.88888889 0.54 0.97 0.51
|
|
0.86 0.57 0.55 0.37 ]
|
|
|
|
mean value: 0.7178888888888889
|
|
|
|
key: test_roc_auc
|
|
value: [0.78392622 0.69565217 0.8872549 0.67446524 0.85895722 0.79612299
|
|
0.74465241 0.59090909 0.60695187 0.58355615]
|
|
|
|
mean value: 0.7222448267844688
|
|
|
|
key: train_roc_auc
|
|
value: [0.89077488 0.74296085 0.88906985 0.76429967 0.88809446 0.75011401
|
|
0.88276873 0.78092834 0.76929967 0.68418567]
|
|
|
|
mean value: 0.8042496131294506
|
|
|
|
key: test_jcc
|
|
value: [0.36363636 0.20754717 0.625 0.33333333 0.41666667 0.5
|
|
0.33333333 0.18181818 0.2 0.16666667]
|
|
|
|
mean value: 0.33280017152658664
|
|
|
|
key: train_jcc
|
|
value: [0.48691099 0.24205379 0.52694611 0.5046729 0.44292237 0.48113208
|
|
0.5443038 0.54285714 0.51401869 0.36633663]
|
|
|
|
mean value: 0.46521545049547125
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0385313 0.06978273 0.05149913 0.08231616 0.07145095 0.06117988
|
|
0.04736829 0.06787324 0.03546977 0.06965303]
|
|
|
|
mean value: 0.05951244831085205
|
|
|
|
key: score_time
|
|
value: [0.02941966 0.02366066 0.01590729 0.03571057 0.03261065 0.04180789
|
|
0.02399945 0.01923251 0.01207161 0.04127169]
|
|
|
|
mean value: 0.02756919860839844
|
|
|
|
key: test_mcc
|
|
value: [0.61126063 0.43754361 0.74715612 0.28152101 0.49398293 0.51178719
|
|
0.49398293 0.40070776 0.57754011 0.57478846]
|
|
|
|
mean value: 0.5130270755632105
|
|
|
|
key: train_mcc
|
|
value: [0.72265053 0.54542212 0.74972636 0.45004776 0.49882429 0.79137057
|
|
0.39848852 0.5438872 0.74247904 0.78344513]
|
|
|
|
mean value: 0.6226341527355859
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.8875 0.9375 0.87341772 0.89873418 0.87341772
|
|
0.89873418 0.88607595 0.89873418 0.91139241]
|
|
|
|
mean value: 0.8965506329113924
|
|
|
|
key: train_accuracy
|
|
value: [0.914446 0.90603086 0.9312763 0.89215686 0.89915966 0.94677871
|
|
0.88515406 0.90616246 0.94257703 0.95098039]
|
|
|
|
mean value: 0.9174722343355295
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.47058824 0.7826087 0.16666667 0.42857143 0.58333333
|
|
0.42857143 0.30769231 0.63636364 0.58823529]
|
|
|
|
mean value: 0.5059297692929406
|
|
|
|
key: train_fscore
|
|
value: [0.75303644 0.4962406 0.78222222 0.384 0.44615385 0.82075472
|
|
0.30508475 0.5037037 0.76023392 0.80225989]
|
|
|
|
mean value: 0.6053690078708643
|
|
|
|
key: test_precision
|
|
value: [0.61538462 0.66666667 0.81818182 1. 1. 0.53846154
|
|
1. 1. 0.63636364 0.83333333]
|
|
|
|
mean value: 0.8108391608391609
|
|
|
|
key: train_precision
|
|
value: [0.63265306 1. 0.6984127 0.96 0.96666667 0.77678571
|
|
1. 0.97142857 0.91549296 0.92207792]
|
|
|
|
mean value: 0.8843517591842541
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.36363636 0.75 0.09090909 0.27272727 0.63636364
|
|
0.27272727 0.18181818 0.63636364 0.45454545]
|
|
|
|
mean value: 0.43863636363636366
|
|
|
|
key: train_recall
|
|
value: [0.93 0.33 0.88888889 0.24 0.29 0.87
|
|
0.18 0.34 0.65 0.71 ]
|
|
|
|
mean value: 0.5428888888888889
|
|
|
|
key: test_roc_auc
|
|
value: [0.82740448 0.66732543 0.86029412 0.54545455 0.63636364 0.77406417
|
|
0.63636364 0.59090909 0.78877005 0.71991979]
|
|
|
|
mean value: 0.7046868945206541
|
|
|
|
key: train_roc_auc
|
|
value: [0.92095432 0.665 0.91349982 0.61918567 0.64418567 0.91464169
|
|
0.59 0.66918567 0.82011401 0.85011401]
|
|
|
|
mean value: 0.7606880852136629
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.30769231 0.64285714 0.09090909 0.27272727 0.41176471
|
|
0.27272727 0.18181818 0.46666667 0.41666667]
|
|
|
|
mean value: 0.3563829307946955
|
|
|
|
key: train_jcc
|
|
value: [0.6038961 0.33 0.64233577 0.23762376 0.28712871 0.696
|
|
0.18 0.33663366 0.61320755 0.66981132]
|
|
|
|
mean value: 0.45966368768578514
|
|
|
|
MCC on Blind test: 0.5
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.45761418 0.44829988 0.63949275 0.64037538 0.63990831 0.52243495
|
|
0.23520517 0.24666142 0.24852872 0.23637342]
|
|
|
|
mean value: 0.4314894199371338
|
|
|
|
key: score_time
|
|
value: [0.04816031 0.04698753 0.04737043 0.02180266 0.04970551 0.01631689
|
|
0.01759291 0.01567197 0.01605535 0.0212636 ]
|
|
|
|
mean value: 0.030092716217041016
|
|
|
|
key: test_mcc
|
|
value: [0.49436016 0.50761192 0.68803296 0.50667099 0.78877005 0.83459145
|
|
0.57754011 0.64658323 0.74662021 0.6166353 ]
|
|
|
|
mean value: 0.6407416395771031
|
|
|
|
key: train_mcc
|
|
value: [0.91707162 0.92292711 0.91673807 0.91746997 0.91095361 0.92294382
|
|
0.93493402 0.91673282 0.91673282 0.87397233]
|
|
|
|
mean value: 0.9150476183677066
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.8875 0.925 0.88607595 0.94936709 0.96202532
|
|
0.89873418 0.92405063 0.93670886 0.91139241]
|
|
|
|
mean value: 0.9180854430379747
|
|
|
|
key: train_accuracy
|
|
value: [0.98036466 0.98176718 0.98036466 0.98039216 0.9789916 0.98179272
|
|
0.98459384 0.98039216 0.98039216 0.97058824]
|
|
|
|
mean value: 0.9799639350831496
|
|
|
|
key: test_fscore
|
|
value: [0.42857143 0.57142857 0.72727273 0.57142857 0.81818182 0.84210526
|
|
0.63636364 0.66666667 0.7826087 0.66666667]
|
|
|
|
mean value: 0.6711294045390155
|
|
|
|
key: train_fscore
|
|
value: [0.92783505 0.93264249 0.92783505 0.92857143 0.92227979 0.93264249
|
|
0.94300518 0.92631579 0.92631579 0.88888889]
|
|
|
|
mean value: 0.9256331947686998
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 0.8 0.6 0.81818182 1.
|
|
0.63636364 0.85714286 0.75 0.7 ]
|
|
|
|
mean value: 0.7761688311688312
|
|
|
|
key: train_precision
|
|
value: [0.95744681 0.96774194 0.94736842 0.94791667 0.95698925 0.96774194
|
|
0.97849462 0.97777778 0.97777778 0.94382022]
|
|
|
|
mean value: 0.9623075418440077
|
|
|
|
key: test_recall
|
|
value: [0.27272727 0.54545455 0.66666667 0.54545455 0.81818182 0.72727273
|
|
0.63636364 0.54545455 0.81818182 0.63636364]
|
|
|
|
mean value: 0.6212121212121212
|
|
|
|
key: train_recall
|
|
value: [0.9 0.9 0.90909091 0.91 0.89 0.9
|
|
0.91 0.88 0.88 0.84 ]
|
|
|
|
mean value: 0.8919090909090909
|
|
|
|
key: test_roc_auc
|
|
value: [0.63636364 0.74374177 0.81862745 0.74331551 0.89438503 0.86363636
|
|
0.78877005 0.76537433 0.88703209 0.79612299]
|
|
|
|
mean value: 0.7937369216461289
|
|
|
|
key: train_roc_auc
|
|
value: [0.94673736 0.94755302 0.95047379 0.95092834 0.94174267 0.947557
|
|
0.95337134 0.93837134 0.93837134 0.91592834]
|
|
|
|
mean value: 0.9431034526817773
|
|
|
|
key: test_jcc
|
|
value: [0.27272727 0.4 0.57142857 0.4 0.69230769 0.72727273
|
|
0.46666667 0.5 0.64285714 0.5 ]
|
|
|
|
mean value: 0.5173260073260073
|
|
|
|
key: train_jcc
|
|
value: [0.86538462 0.87378641 0.86538462 0.86666667 0.85576923 0.87378641
|
|
0.89215686 0.8627451 0.8627451 0.8 ]
|
|
|
|
mean value: 0.8618425002562639
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.91
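Note on test_fscore and test_jcc: for binary labels the Jaccard index and the F1 score are tied by J = F1 / (2 - F1), so the two arrays logged for each model carry the same information. The sketch below checks that identity against the first three AdaBoost test folds copied from the log above. The helper name jaccard_from_f1 is hypothetical and is not part of the original scripts.

```python
import math


def jaccard_from_f1(f1: float) -> float:
    """Jaccard index implied by a binary F1 score: J = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# First three AdaBoost test folds, copied from the log above.
test_fscore = [0.42857143, 0.57142857, 0.72727273]
test_jcc = [0.27272727, 0.4, 0.57142857]

for f1, jcc in zip(test_fscore, test_jcc):
    # The logged values are rounded to 8 decimals, so compare with a tolerance.
    assert math.isclose(jaccard_from_f1(f1), jcc, abs_tol=1e-6)
```

Because the identity is monotonic, ranking models by mean F1 or by mean Jaccard should usually give a similar ordering, although the fold means themselves are not related by the same formula.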
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
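Note on the OOB warning for the Bagging Classifier above: the model is created without n_estimators, so scikit-learn's default of 10 bootstrap estimators applies. With so few bags, some training rows are likely to be drawn into every bag and therefore never receive an out-of-bag prediction. The sketch below estimates how many rows that affects. It assumes bootstrap samples of the same size as the training fold, and a fold size of 713 rows, which is inferred from the logged train_accuracy values rather than stated in the log. The helper name prob_never_oob is hypothetical.

```python
def prob_never_oob(n_samples: int, n_estimators: int) -> float:
    """Probability that one row is drawn into every bootstrap bag,
    so that no estimator can supply an out-of-bag prediction for it."""
    # Chance that a given row appears at least once in a single bag
    # of n_samples draws with replacement (about 0.632 for large n).
    p_in_bag = 1.0 - (1.0 - 1.0 / n_samples) ** n_samples
    return p_in_bag ** n_estimators


n_train = 713  # assumed training-fold size, inferred from the log
for n_estimators in (10, 50, 100):
    p = prob_never_oob(n_train, n_estimators)
    print(f"n_estimators={n_estimators}: "
          f"expected rows without OOB score = {n_train * p:.3g}")
```

Under these assumptions a handful of rows per fold have no OOB estimate at 10 estimators, which would also explain the accompanying division warning. Raising n_estimators should make both warnings disappear.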
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.20900869 0.20622039 0.2010932 0.21744394 0.25689149 0.24539351
|
|
0.24098468 0.23988628 0.22156787 0.23667002]
|
|
|
|
mean value: 0.2275160074234009
|
|
|
|
key: score_time
|
|
value: [0.04157877 0.03804398 0.04044437 0.02461791 0.0289681 0.02714062
|
|
0.02853703 0.02809358 0.02449107 0.02955127]
|
|
|
|
mean value: 0.031146669387817384
|
|
|
|
key: test_mcc
|
|
value: [0.51864618 0.66195674 0.79388419 0.72659961 0.84849067 0.77643684
|
|
0.36631016 0.72659961 0.47099187 0.66135521]
|
|
|
|
mean value: 0.6551271071223412
|
|
|
|
key: train_mcc
|
|
value: [0.95890563 0.96478211 0.97047687 0.95891444 0.95891444 0.96483326
|
|
0.95295992 0.9647899 0.97073494 0.97661989]
|
|
|
|
mean value: 0.9641931415166104
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.925 0.95 0.93670886 0.96202532 0.94936709
|
|
0.84810127 0.93670886 0.88607595 0.92405063]
|
|
|
|
mean value: 0.9218037974683544
|
|
|
|
key: train_accuracy
|
|
value: [0.99018233 0.99158485 0.99298738 0.99019608 0.99019608 0.99159664
|
|
0.98879552 0.99159664 0.9929972 0.99439776]
|
|
|
|
mean value: 0.9914530468568914
|
|
|
|
key: test_fscore
|
|
value: [0.55555556 0.7 0.81818182 0.76190476 0.86956522 0.8
|
|
0.45454545 0.76190476 0.52631579 0.7 ]
|
|
|
|
mean value: 0.6947973358957341
|
|
|
|
key: train_fscore
|
|
value: [0.96373057 0.96938776 0.97409326 0.96373057 0.96373057 0.96907216
|
|
0.95918367 0.96938776 0.97435897 0.97959184]
|
|
|
|
mean value: 0.9686267133808856
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.77777778 0.9 0.8 0.83333333 0.88888889
|
|
0.45454545 0.8 0.625 0.77777778]
|
|
|
|
mean value: 0.7571608946608946
|
|
|
|
key: train_precision
|
|
value: [1. 0.98958333 1. 1. 1. 1.
|
|
0.97916667 0.98958333 1. 1. ]
|
|
|
|
mean value: 0.9958333333333333
|
|
|
|
key: test_recall
|
|
value: [0.45454545 0.63636364 0.75 0.72727273 0.90909091 0.72727273
|
|
0.45454545 0.72727273 0.45454545 0.63636364]
|
|
|
|
mean value: 0.6477272727272727
|
|
|
|
key: train_recall
|
|
value: [0.93 0.95 0.94949495 0.93 0.93 0.94
|
|
0.94 0.95 0.95 0.96 ]
|
|
|
|
mean value: 0.942949494949495
|
|
|
|
key: test_roc_auc
|
|
value: [0.71277997 0.80368906 0.86764706 0.84893048 0.93983957 0.85628342
|
|
0.68315508 0.84893048 0.7052139 0.80347594]
|
|
|
|
mean value: 0.8069944974037045
|
|
|
|
key: train_roc_auc
|
|
value: [0.965 0.97418434 0.97474747 0.965 0.965 0.97
|
|
0.96837134 0.97418567 0.975 0.98 ]
|
|
|
|
mean value: 0.9711488817319649
|
|
|
|
key: test_jcc
|
|
value: [0.38461538 0.53846154 0.69230769 0.61538462 0.76923077 0.66666667
|
|
0.29411765 0.61538462 0.35714286 0.53846154]
|
|
|
|
mean value: 0.5471773324714502
|
|
|
|
key: train_jcc
|
|
value: [0.93 0.94059406 0.94949495 0.93 0.93 0.94
|
|
0.92156863 0.94059406 0.95 0.96 ]
|
|
|
|
mean value: 0.9392251695757811
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
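Note on the roc_auc values: the logged numbers look like they were computed from hard class predictions rather than from probabilities. In that case ROC AUC reduces to the mean of recall and specificity. This is an inference from the numbers, not something the log states. It can be checked on the Bagging Classifier training folds where train_precision is 1.0, because no false positives means specificity is 1. The helper name auc_from_hard_labels is hypothetical.

```python
import math


def auc_from_hard_labels(recall: float, specificity: float) -> float:
    """ROC AUC of a classifier that outputs only 0/1 labels.
    The ROC curve then has a single interior point, and the area
    under it equals the mean of recall and specificity."""
    return 0.5 * (recall + specificity)


# Bagging Classifier training folds 1, 3 and 4 from the log above.
# train_precision is 1.0 in each, so specificity is taken as 1.0.
train_recall = [0.93, 0.94949495, 0.93]
train_roc_auc = [0.965, 0.97474747, 0.965]

for recall, auc in zip(train_recall, train_roc_auc):
    assert math.isclose(auc_from_hard_labels(recall, 1.0), auc, abs_tol=1e-6)
```

If that reading is right, the roc_auc columns behave like balanced accuracy. A threshold-free AUC would need predict_proba or decision_function scores.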
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.61395884 0.5485301 0.56743336 0.51388955 0.55802751 0.559268
|
|
0.47061753 0.56458807 0.535954 0.55411768]
|
|
|
|
mean value: 0.5486384630203247
|
|
|
|
key: score_time
|
|
value: [0.04558992 0.04537392 0.04560876 0.05530596 0.0599947 0.04393911
|
|
0.04824662 0.04597187 0.0459497 0.04629207]
|
|
|
|
mean value: 0.04822726249694824
|
|
|
|
key: test_mcc
|
|
value: [ 0.40104758 0.11224603 0. 0. 0.28152101 0.16794369
|
|
-0.079909 0.28152101 -0.04554016 0.40070776]
|
|
|
|
mean value: 0.1519537908453421
|
|
|
|
key: train_mcc
|
|
value: [0.81042317 0.81042317 0.76854967 0.81045494 0.78418497 0.77753061
|
|
0.79738754 0.77083951 0.81694021 0.79080362]
|
|
|
|
mean value: 0.7937537410006497
|
|
|
|
key: test_accuracy
|
|
value: [0.8875 0.85 0.85 0.86075949 0.87341772 0.86075949
|
|
0.82278481 0.87341772 0.84810127 0.88607595]
|
|
|
|
mean value: 0.8612816455696203
|
|
|
|
key: train_accuracy
|
|
value: [0.95652174 0.95652174 0.94810659 0.95658263 0.95098039 0.94957983
|
|
0.95378151 0.94817927 0.95798319 0.95238095]
|
|
|
|
mean value: 0.9530617857241073
|
|
|
|
key: test_fscore
|
|
value: [0.30769231 0.14285714 0. 0. 0.16666667 0.15384615
|
|
0. 0.16666667 0. 0.30769231]
|
|
|
|
mean value: 0.12454212454212456
|
|
|
|
key: train_fscore
|
|
value: [0.81656805 0.81656805 0.77018634 0.81656805 0.78787879 0.7804878
|
|
0.80239521 0.77300613 0.82352941 0.79518072]
|
|
|
|
mean value: 0.7982368549378833
|
|
|
|
key: test_precision
|
|
value: [1. 0.33333333 0. 0. 1. 0.5
|
|
0. 1. 0. 1. ]
|
|
|
|
mean value: 0.48333333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.18181818 0.09090909 0. 0. 0.09090909 0.09090909
|
|
0. 0.09090909 0. 0.18181818]
|
|
|
|
mean value: 0.07272727272727272
|
|
|
|
key: train_recall
|
|
value: [0.69 0.69 0.62626263 0.69 0.65 0.64
|
|
0.67 0.63 0.7 0.66 ]
|
|
|
|
mean value: 0.6646262626262627
|
|
|
|
key: test_roc_auc
|
|
value: [0.59090909 0.53096179 0.5 0.5 0.54545455 0.5381016
|
|
0.47794118 0.54545455 0.49264706 0.59090909]
|
|
|
|
mean value: 0.5312378904130822
|
|
|
|
key: train_roc_auc
|
|
value: [0.845 0.845 0.81313131 0.845 0.825 0.82
|
|
0.835 0.815 0.85 0.83 ]
|
|
|
|
mean value: 0.8323131313131313
|
|
|
|
key: test_jcc
|
|
value: [0.18181818 0.07692308 0. 0. 0.09090909 0.08333333
|
|
0. 0.09090909 0. 0.18181818]
|
|
|
|
mean value: 0.07057109557109557
|
|
|
|
key: train_jcc
|
|
value: [0.69 0.69 0.62626263 0.69 0.65 0.64
|
|
0.67 0.63 0.7 0.66 ]
|
|
|
|
mean value: 0.6646262626262627
|
|
|
|
MCC on Blind test: -0.05
|
|
|
|
Accuracy on Blind test: 0.85
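Note on accuracy versus MCC: the Gaussian Process results above pair an accuracy near 0.85 with an MCC near zero, which is what a model predicting almost only the majority class produces on imbalanced data. The sketch below recomputes both metrics from confusion counts. The counts are not in the log. They are inferred from the first and third test folds, assuming 80 samples per fold with 11 and 12 positives respectively. The zero-denominator convention (MCC = 0) is the one scikit-learn uses.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # When a row or column of the confusion matrix is empty,
    # the coefficient is undefined; report 0.0 in that case.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    return (tp + tn) / (tp + tn + fp + fn)


# Fold 1 (inferred): 2 of 11 positives found, no false positives.
fold1 = dict(tp=2, tn=69, fp=0, fn=9)
# Fold 3 (inferred): every sample predicted negative.
fold3 = dict(tp=0, tn=68, fp=0, fn=12)

for name, counts in (("fold 1", fold1), ("fold 3", fold3)):
    print(name, round(accuracy(**counts), 4), round(mcc(**counts), 8))
```

Fold 3 shows the extreme case: 85% accuracy with no positive ever predicted, which is also the situation that triggers the "Precision is ill-defined" warning.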
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.4557972 1.30422163 1.47492957 1.36501622 1.33721948 1.4778502
|
|
1.45930934 1.19137025 0.9904933 0.96908593]
|
|
|
|
mean value: 1.3025293111801148
|
|
|
|
key: score_time
|
|
value: [0.0134933 0.01342702 0.01551461 0.01319695 0.01367974 0.01323009
|
|
0.00934172 0.01059604 0.00958824 0.0093925 ]
|
|
|
|
mean value: 0.01214601993560791
|
|
|
|
key: test_mcc
|
|
value: [0.64666979 0.61736585 0.70588235 0.72659961 0.8307804 0.83459145
|
|
0.54627358 0.72659961 0.80762516 0.72659961]
|
|
|
|
mean value: 0.7168987408471386
|
|
|
|
key: train_mcc
|
|
value: [0.99417686 0.99417686 1. 0.98248849 0.98834113 0.98837134
|
|
0.99417818 0.98250043 0.98837134 0.99417818]
|
|
|
|
mean value: 0.990678278264288
|
|
|
|
key: test_accuracy
|
|
value: [0.925 0.9125 0.925 0.93670886 0.94936709 0.96202532
|
|
0.89873418 0.93670886 0.94936709 0.93670886]
|
|
|
|
mean value: 0.9332120253164558
|
|
|
|
key: train_accuracy
|
|
value: [0.99859748 0.99859748 1. 0.99579832 0.99719888 0.99719888
|
|
0.99859944 0.99579832 0.99719888 0.99859944]
|
|
|
|
mean value: 0.9977587107774386
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.66666667 0.75 0.76190476 0.84615385 0.84210526
|
|
0.6 0.76190476 0.83333333 0.76190476]
|
|
|
|
mean value: 0.7448973395026026
|
|
|
|
key: train_fscore
|
|
value: [0.99497487 0.99497487 1. 0.98477157 0.98989899 0.99
|
|
0.99497487 0.98492462 0.99 0.99497487]
|
|
|
|
mean value: 0.9919494684106066
|
|
|
|
key: test_precision
|
|
value: [1. 0.7 0.75 0.8 0.73333333 1.
|
|
0.66666667 0.8 0.76923077 0.8 ]
|
|
|
|
mean value: 0.801923076923077
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 0.99
|
|
1. 0.98989899 0.99 1. ]
|
|
|
|
mean value: 0.996989898989899
|
|
|
|
key: test_recall
|
|
value: [0.45454545 0.63636364 0.75 0.72727273 1. 0.72727273
|
|
0.54545455 0.72727273 0.90909091 0.72727273]
|
|
|
|
mean value: 0.7204545454545455
|
|
|
|
key: train_recall
|
|
value: [0.99 0.99 1. 0.97 0.98 0.99 0.99 0.98 0.99 0.99]
|
|
|
|
mean value: 0.987
|
|
|
|
key: test_roc_auc
|
|
value: [0.72727273 0.79644269 0.85294118 0.84893048 0.97058824 0.86363636
|
|
0.75066845 0.84893048 0.93248663 0.84893048]
|
|
|
|
mean value: 0.8440827714485004
|
|
|
|
key: train_roc_auc
|
|
value: [0.995 0.995 1. 0.985 0.99 0.99418567
|
|
0.995 0.98918567 0.99418567 0.995 ]
|
|
|
|
mean value: 0.9932557003257328
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.5 0.6 0.61538462 0.73333333 0.72727273
|
|
0.42857143 0.61538462 0.71428571 0.61538462]
|
|
|
|
mean value: 0.6004162504162505
|
|
|
|
key: train_jcc
|
|
value: [0.99 0.99 1. 0.97 0.98 0.98019802
|
|
0.99 0.97029703 0.98019802 0.99 ]
|
|
|
|
mean value: 0.984069306930693
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
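Note on reading the metrics back: each "value:" array is wrapped over two lines and separated by "|" filler lines, which makes the log awkward to post-process. The sketch below is one possible way to pull the per-fold arrays back out and recompute the logged mean. The function name parse_metrics is hypothetical, and the sample text is the Gradient Boosting test_recall block copied from above.

```python
import re
from statistics import fmean

NUMBER = re.compile(r"-?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?")


def parse_metrics(log_text: str) -> dict:
    """Return {key: [fold values]} from 'key:' / 'value: [...]' blocks."""
    lines = [ln.strip() for ln in log_text.splitlines()]
    lines = [ln for ln in lines if ln and ln != "|"]  # drop filler lines
    metrics, key, buffer = {}, None, None
    for ln in lines:
        if ln.startswith("key:"):
            key, buffer = ln.split(":", 1)[1].strip(), None
        elif ln.startswith("value:") and key is not None:
            buffer = ln.split(":", 1)[1]
        elif buffer is not None:
            buffer += " " + ln  # continuation of a wrapped array
        if buffer is not None and "]" in buffer:
            metrics[key] = [float(x) for x in NUMBER.findall(buffer)]
            key, buffer = None, None
    return metrics


sample = """
key: test_recall
|
|
value: [0.45454545 0.63636364 0.75 0.72727273 1. 0.72727273
|
|
0.54545455 0.72727273 0.90909091 0.72727273]
|
|
mean value: 0.7204545454545455
"""

metrics = parse_metrics(sample)
print(len(metrics["test_recall"]), fmean(metrics["test_recall"]))
```

The recomputed mean agrees with the logged "mean value" only up to the 8-decimal rounding of the printed array.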
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.09551215 0.06260157 0.07106519 0.04405999 0.12507939 0.04398108
|
|
0.07186937 0.04515743 0.07706308 0.09362435]
|
|
|
|
mean value: 0.07300136089324952
|
|
|
|
key: score_time
|
|
value: [0.02474332 0.01368976 0.02133179 0.01617622 0.02065301 0.02650046
|
|
0.01577353 0.01598001 0.02128029 0.02534318]
|
|
|
|
mean value: 0.020147156715393067
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0.02411658 -0.06726728 -0.079909 0.11138831 0.16794369
|
|
0.04562045 0.04562045 -0.06482037 -0.079909 ]
|
|
|
|
mean value: 0.010278382103322166
|
|
|
|
key: train_mcc
|
|
value: [0.13131382 0.18596753 0.13208296 0.2465605 0.09279817 0.16095705
|
|
0.13132856 0.09279817 0.13132856 0.13132856]
|
|
|
|
mean value: 0.14364638833100257
|
|
|
|
key: test_accuracy
|
|
value: [0.8625 0.8125 0.825 0.82278481 0.84810127 0.86075949
|
|
0.82278481 0.82278481 0.83544304 0.82278481]
|
|
|
|
mean value: 0.8335443037974684
|
|
|
|
key: train_accuracy
|
|
value: [0.86255259 0.86535764 0.86395512 0.8697479 0.86134454 0.86414566
|
|
0.8627451 0.86134454 0.8627451 0.8627451 ]
|
|
|
|
mean value: 0.8636683284814627
|
|
|
|
key: test_fscore
|
|
value: [0. 0.11764706 0. 0. 0.14285714 0.15384615
|
|
0.125 0.125 0. 0. ]
|
|
|
|
mean value: 0.06643503555268263
|
|
|
|
key: train_fscore
|
|
value: [0.03921569 0.07692308 0.03960396 0.13084112 0.01980198 0.05825243
|
|
0.03921569 0.01980198 0.03921569 0.03921569]
|
|
|
|
mean value: 0.05020872914929885
|
|
|
|
key: test_precision
|
|
value: [0. 0.16666667 0. 0. 0.33333333 0.5
|
|
0.2 0.2 0. 0. ]
|
|
|
|
mean value: 0.14
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0. 0.09090909 0. 0. 0.09090909 0.09090909
|
|
0.09090909 0.09090909 0. 0. ]
|
|
|
|
mean value: 0.045454545454545456
|
|
|
|
key: train_recall
|
|
value: [0.02 0.04 0.02020202 0.07 0.01 0.03
|
|
0.02 0.01 0.02 0.02 ]
|
|
|
|
mean value: 0.02602020202020202
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.50922266 0.48529412 0.47794118 0.53074866 0.5381016
|
|
0.51604278 0.51604278 0.48529412 0.47794118]
|
|
|
|
mean value: 0.5036629078508874
|
|
|
|
key: train_roc_auc
|
|
value: [0.51 0.52 0.51010101 0.535 0.505 0.515
|
|
0.51 0.505 0.51 0.51 ]
|
|
|
|
mean value: 0.513010101010101
|
|
|
|
key: test_jcc
|
|
value: [0. 0.0625 0. 0. 0.07692308 0.08333333
|
|
0.06666667 0.06666667 0. 0. ]
|
|
|
|
mean value: 0.03560897435897436
|
|
|
|
key: train_jcc
|
|
value: [0.02 0.04 0.02020202 0.07 0.01 0.03
|
|
0.02 0.01 0.02 0.02 ]
|
|
|
|
mean value: 0.02602020202020202
|
|
|
|
MCC on Blind test: -0.07
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
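The QDA figures above (train precision 1.0, train recall about 0.02, accuracy about 0.86, MCC about 0.13) can be reproduced from a single confusion matrix. The sketch below is only an illustration: the counts are inferred from the first training fold's logged values, on the assumption of 713 training samples with 100 positives, and are not printed by the pipeline itself. The helper name `confusion_metrics` is hypothetical.

```python
from math import sqrt

# Counts inferred from the first QDA training fold (assumption, not logged):
# 713 samples, 100 positives, only 2 of them predicted positive, no false positives.
TP, FP, FN, TN = 2, 0, 98, 613


def confusion_metrics(tp, fp, fn, tn):
    """Return the binary-classification metrics named in the log (hypothetical helper)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
        # Jaccard index of the positive class ("jcc" in the log)
        "jcc": tp / (tp + fp + fn) if (tp + fp + fn) else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        # ROC AUC computed on hard labels reduces to this quantity
        "balanced_accuracy": (recall + specificity) / 2,
    }


metrics = confusion_metrics(TP, FP, FN, TN)
for name, value in metrics.items():
    print(f"{name}: {value:.8f}")
```

With no false positives the Jaccard index equals recall, which is consistent with train_jcc and train_recall being identical in the QDA block. The high accuracy next to a near-zero MCC reflects the class imbalance: predicting almost everything as negative is already right about 86% of the time.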
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06339288 0.05783248 0.07663059 0.05685568 0.0568552 0.05862117
|
|
0.07298279 0.05087423 0.05057597 0.05194688]
|
|
|
|
mean value: 0.05965678691864014
|
|
|
|
key: score_time
|
|
value: [0.03598714 0.02616334 0.02632928 0.0277431 0.02712369 0.03276157
|
|
0.03423834 0.03742623 0.02653933 0.03339148]
|
|
|
|
mean value: 0.030770349502563476
|
|
|
|
key: test_mcc
|
|
value: [0.49671738 0.43221037 0.79349205 0.77524841 0.71339159 0.66135521
|
|
0.66135521 0.57419245 0.47099187 0.59219173]
|
|
|
|
mean value: 0.6171146260529131
|
|
|
|
key: train_mcc
|
|
value: [0.75737775 0.75592171 0.71913751 0.70836965 0.72964765 0.74296824
|
|
0.75133237 0.70919717 0.73600183 0.73045433]
|
|
|
|
mean value: 0.7340408205100952
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.875 0.95 0.94936709 0.93670886 0.92405063
|
|
0.92405063 0.91139241 0.88607595 0.91139241]
|
|
|
|
mean value: 0.9168037974683544
|
|
|
|
key: train_accuracy
|
|
value: [0.94530154 0.94530154 0.93828892 0.93557423 0.93977591 0.94257703
|
|
0.94397759 0.93557423 0.94117647 0.93977591]
|
|
|
|
mean value: 0.9407323378159118
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.5 0.8 0.77777778 0.73684211 0.7
|
|
0.7 0.53333333 0.52631579 0.63157895]
|
|
|
|
mean value: 0.6405847953216375
|
|
|
|
key: train_fscore
|
|
value: [0.77966102 0.77192982 0.73809524 0.72941176 0.75144509 0.76300578
|
|
0.7752809 0.73255814 0.75581395 0.75428571]
|
|
|
|
mean value: 0.7551487417549074
|
|
|
|
key: test_precision
|
|
value: [0.8 0.55555556 1. 1. 0.875 0.77777778
|
|
0.77777778 1. 0.625 0.75 ]
|
|
|
|
mean value: 0.8161111111111111
|
|
|
|
key: train_precision
|
|
value: [0.8961039 0.92957746 0.89855072 0.88571429 0.89041096 0.90410959
|
|
0.88461538 0.875 0.90277778 0.88 ]
|
|
|
|
mean value: 0.8946860081582964
|
|
|
|
key: test_recall
|
|
value: [0.36363636 0.45454545 0.66666667 0.63636364 0.63636364 0.63636364
|
|
0.63636364 0.36363636 0.45454545 0.54545455]
|
|
|
|
mean value: 0.5393939393939394
|
|
|
|
key: train_recall
|
|
value: [0.69 0.66 0.62626263 0.62 0.65 0.66
|
|
0.69 0.63 0.65 0.66 ]
|
|
|
|
mean value: 0.6536262626262627
|
|
|
|
key: test_roc_auc
|
|
value: [0.67457181 0.69828722 0.83333333 0.81818182 0.81082888 0.80347594
|
|
0.80347594 0.68181818 0.7052139 0.75802139]
|
|
|
|
mean value: 0.758720840114702
|
|
|
|
key: train_roc_auc
|
|
value: [0.83847471 0.8259217 0.80743099 0.80348534 0.81848534 0.82429967
|
|
0.83767101 0.80767101 0.81929967 0.82267101]
|
|
|
|
mean value: 0.820541046038065
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.33333333 0.66666667 0.63636364 0.58333333 0.53846154
|
|
0.53846154 0.36363636 0.35714286 0.46153846]
|
|
|
|
mean value: 0.48122710622710624
|
|
|
|
key: train_jcc
|
|
value: [0.63888889 0.62857143 0.58490566 0.57407407 0.60185185 0.61682243
|
|
0.63302752 0.57798165 0.60747664 0.60550459]
|
|
|
|
mean value: 0.6069104730652053
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
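Each metric in this log is printed as a "key:" line, a "value:" array that wraps over two lines, and a "mean value:" line. A minimal sketch of how such blocks could be read back into Python follows. The function names `parse_cv_blocks` and `_to_floats` are hypothetical, and the sample lines are copied from the Ridge Classifier test_mcc block above. The sketch assumes clean blocks; a "value:" line that has stderr text interleaved into it would need to be repaired first.

```python
import re
from statistics import mean

NUMBER = re.compile(r"-?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?")


def _to_floats(parts):
    """Extract every number from the collected array lines."""
    return [float(tok) for tok in NUMBER.findall(" ".join(parts))]


def parse_cv_blocks(lines):
    """Map each 'key:' name to its list of per-fold values (hypothetical helper)."""
    results = {}
    key, buffer, collecting = None, [], False
    for raw in lines:
        line = raw.strip()
        if line.startswith("key:"):
            key = line.split(":", 1)[1].strip()
            buffer, collecting = [], False
        elif key and line.startswith("value:"):
            buffer = [line.split(":", 1)[1]]
            collecting = "]" not in line
            if not collecting:
                results[key] = _to_floats(buffer)
        elif key and collecting and line not in ("", "|"):
            # continuation line of a wrapped array
            buffer.append(line)
            if "]" in line:
                collecting = False
                results[key] = _to_floats(buffer)
    return results


sample = [
    "key: test_mcc",
    "|",
    "|",
    "value: [0.49671738 0.43221037 0.79349205 0.77524841 0.71339159 0.66135521",
    "|",
    "|",
    "0.66135521 0.57419245 0.47099187 0.59219173]",
    "|",
    "|",
    "mean value: 0.6171146260529131",
]

parsed = parse_cv_blocks(sample)
print(parsed["test_mcc"])
print("recomputed mean:", mean(parsed["test_mcc"]))
```

Recomputing the mean from the printed eight-digit values agrees with the logged "mean value" only to about eight decimal places, since the array entries are rounded for display.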
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.55796146 0.56648779 0.42351294 0.53978658 0.51576281 0.65841341
|
|
0.52290344 0.52209496 0.5138824 0.54724836]
|
|
|
|
mean value: 0.5368054151535034
|
|
|
|
key: score_time
|
|
value: [0.04054976 0.03365135 0.02662635 0.02705789 0.02737045 0.02712655
|
|
0.02725124 0.02729344 0.02713633 0.02914262]
|
|
|
|
mean value: 0.029320597648620605
|
|
|
|
key: test_mcc
|
|
value: [0.49671738 0.43221037 0.79349205 0.77524841 0.71339159 0.66135521
|
|
0.66135521 0.57419245 0.47099187 0.59219173]
|
|
|
|
mean value: 0.6171146260529131
|
|
|
|
key: train_mcc
|
|
value: [0.75737775 0.75592171 0.71913751 0.70836965 0.72964765 0.74296824
|
|
0.75133237 0.70919717 0.73600183 0.73045433]
|
|
|
|
mean value: 0.7340408205100952
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.875 0.95 0.94936709 0.93670886 0.92405063
|
|
0.92405063 0.91139241 0.88607595 0.91139241]
|
|
|
|
mean value: 0.9168037974683544
|
|
|
|
key: train_accuracy
|
|
value: [0.94530154 0.94530154 0.93828892 0.93557423 0.93977591 0.94257703
|
|
0.94397759 0.93557423 0.94117647 0.93977591]
|
|
|
|
mean value: 0.9407323378159118
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.5 0.8 0.77777778 0.73684211 0.7
|
|
0.7 0.53333333 0.52631579 0.63157895]
|
|
|
|
mean value: 0.6405847953216375
|
|
|
|
key: train_fscore
|
|
value: [0.77966102 0.77192982 0.73809524 0.72941176 0.75144509 0.76300578
|
|
0.7752809 0.73255814 0.75581395 0.75428571]
|
|
|
|
mean value: 0.7551487417549074
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:115: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:118: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
key: test_precision
|
|
value: [0.8 0.55555556 1. 1. 0.875 0.77777778
|
|
0.77777778 1. 0.625 0.75 ]
|
|
|
|
mean value: 0.8161111111111111
|
|
|
|
key: train_precision
|
|
value: [0.8961039 0.92957746 0.89855072 0.88571429 0.89041096 0.90410959
|
|
0.88461538 0.875 0.90277778 0.88 ]
|
|
|
|
mean value: 0.8946860081582964
|
|
|
|
key: test_recall
|
|
value: [0.36363636 0.45454545 0.66666667 0.63636364 0.63636364 0.63636364
|
|
0.63636364 0.36363636 0.45454545 0.54545455]
|
|
|
|
mean value: 0.5393939393939394
|
|
|
|
key: train_recall
|
|
value: [0.69 0.66 0.62626263 0.62 0.65 0.66
|
|
0.69 0.63 0.65 0.66 ]
|
|
|
|
mean value: 0.6536262626262627
|
|
|
|
key: test_roc_auc
|
|
value: [0.67457181 0.69828722 0.83333333 0.81818182 0.81082888 0.80347594
|
|
0.80347594 0.68181818 0.7052139 0.75802139]
|
|
|
|
mean value: 0.758720840114702
|
|
|
|
key: train_roc_auc
|
|
value: [0.83847471 0.8259217 0.80743099 0.80348534 0.81848534 0.82429967
|
|
0.83767101 0.80767101 0.81929967 0.82267101]
|
|
|
|
mean value: 0.820541046038065
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.33333333 0.66666667 0.63636364 0.58333333 0.53846154
|
|
0.53846154 0.36363636 0.35714286 0.46153846]
|
|
|
|
mean value: 0.48122710622710624
|
|
|
|
key: train_jcc
|
|
value: [0.63888889 0.62857143 0.58490566 0.57407407 0.60185185 0.61682243
|
|
0.63302752 0.57798165 0.60747664 0.60550459]
|
|
|
|
mean value: 0.6069104730652053
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09754944 0.10211778 0.10299373 0.10591578 0.14832354 0.09998441
|
|
0.10208988 0.12186408 0.10876179 0.10412669]
|
|
|
|
mean value: 0.10937271118164063
|
|
|
|
key: score_time
|
|
value: [0.02463031 0.02883983 0.02641749 0.02212977 0.02265787 0.02216053
|
|
0.02307367 0.03439331 0.02317548 0.02400279]
|
|
|
|
mean value: 0.02514810562133789
|
|
|
|
key: test_mcc
|
|
value: [0.85400682 0.78527876 0.88320546 0.84660737 0.94158382 0.85331034
|
|
0.7972271 0.88273483 0.8028464 0.90184995]
|
|
|
|
mean value: 0.8548650838325829
|
|
|
|
key: train_mcc
|
|
value: [0.88228271 0.89152742 0.87873745 0.88658274 0.87741393 0.88379172
|
|
0.88861386 0.87883096 0.8901769 0.88365546]
|
|
|
|
mean value: 0.8841613145210079
|
|
|
|
key: test_accuracy
|
|
value: [0.9270073 0.89051095 0.94160584 0.91970803 0.97058824 0.92647059
|
|
0.89705882 0.94117647 0.89705882 0.94852941]
|
|
|
|
mean value: 0.9259714469729498
|
|
|
|
key: train_accuracy
|
|
value: [0.9405053 0.94539527 0.93887531 0.94295029 0.93811075 0.94136808
|
|
0.94381107 0.93892508 0.94462541 0.94136808]
|
|
|
|
mean value: 0.9415934630424567
|
|
|
|
key: test_fscore
|
|
value: [0.92647059 0.8951049 0.94202899 0.92517007 0.97101449 0.92753623
|
|
0.90140845 0.94029851 0.90410959 0.95104895]
|
|
|
|
mean value: 0.9284190759769286
|
|
|
|
key: train_fscore
|
|
value: [0.94210944 0.94652833 0.94023904 0.944 0.93968254 0.9427663
|
|
0.9451074 0.94033413 0.94585987 0.94267516]
|
|
|
|
mean value: 0.9429302207466138
|
|
|
|
key: test_precision
|
|
value: [0.92647059 0.85333333 0.94202899 0.87179487 0.95714286 0.91428571
|
|
0.86486486 0.95454545 0.84615385 0.90666667]
|
|
|
|
mean value: 0.9037287182530149
|
|
|
|
key: train_precision
|
|
value: [0.91808346 0.92801252 0.91900312 0.92621664 0.91640867 0.92080745
|
|
0.92379471 0.91912908 0.92523364 0.92211838]
|
|
|
|
mean value: 0.9218807679243093
|
|
|
|
key: test_recall
|
|
value: [0.92647059 0.94117647 0.94202899 0.98550725 0.98529412 0.94117647
|
|
0.94117647 0.92647059 0.97058824 1. ]
|
|
|
|
mean value: 0.9559889173060528
|
|
|
|
key: train_recall
|
|
value: [0.96742671 0.96579805 0.96247961 0.96247961 0.96416938 0.96579805
|
|
0.96742671 0.96254072 0.96742671 0.96416938]
|
|
|
|
mean value: 0.9649714917291475
|
|
|
|
key: test_roc_auc
|
|
value: [0.92700341 0.89087809 0.94160273 0.91922421 0.97058824 0.92647059
|
|
0.89705882 0.94117647 0.89705882 0.94852941]
|
|
|
|
mean value: 0.9259590792838874
|
|
|
|
key: train_roc_auc
|
|
value: [0.94048334 0.94537863 0.93889453 0.94296619 0.93811075 0.94136808
|
|
0.94381107 0.93892508 0.94462541 0.94136808]
|
|
|
|
mean value: 0.9415931155049923
|
|
|
|
key: test_jcc
|
|
value: [0.8630137 0.81012658 0.89041096 0.86075949 0.94366197 0.86486486
|
|
0.82051282 0.88732394 0.825 0.90666667]
|
|
|
|
mean value: 0.8672341001020923
|
|
|
|
key: train_jcc
|
|
value: [0.89055472 0.89848485 0.88721805 0.89393939 0.88622754 0.89172932
|
|
0.8959276 0.88738739 0.89728097 0.89156627]
|
|
|
|
mean value: 0.892031609941911
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
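The ConvergenceWarning lines printed for Logistic Regression suggest either scaling the data or raising max_iter. The pipeline shown above already applies MinMaxScaler to the numerical columns, so raising max_iter is the remaining option. The sketch below is an assumption-laden illustration, not the script that produced this log: the data is synthetic, the column names `num_a`, `num_b`, `cat_a` are hypothetical, `max_iter=1000` is an arbitrary choice, `handle_unknown="ignore"` is added so that a category missing from a training fold does not raise, and stratified 10-fold splitting is assumed.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data (hypothetical columns, not the features in the log)
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "cat_a": rng.choice(["x", "y", "z"], size=n),
})
y = (X["num_a"] + 0.5 * X["num_b"] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Same preprocessing layout as the logged pipeline: scale numerics, one-hot categoricals
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), ["num_a", "num_b"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["cat_a"]),
    ],
)

# max_iter raised above the default of 100 to give lbfgs room to converge
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

scoring = {"mcc": make_scorer(matthews_corrcoef), "accuracy": "accuracy"}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("mean value:", np.mean(value))
```

The dictionary returned by `cross_validate` has the same key layout as the blocks in this log (fit_time, score_time, then test_ and train_ entries per scorer), with one array element per fold.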
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.06157088 2.21632552 2.12772703 2.1870575 2.24508309 2.5032568
|
|
2.55119944 2.44441295 2.43192077 1.92357469]
|
|
|
|
mean value: 2.2692128658294677
|
|
|
|
key: score_time
|
|
value: [0.02278161 0.034199 0.02382827 0.03283787 0.02307153 0.02474856
|
|
0.02306676 0.02312875 0.02295542 0.01745105]
|
|
|
|
mean value: 0.024806880950927736
|
|
|
|
key: test_mcc
|
|
value: [0.98550725 0.81460896 0.91281179 0.88654289 0.91334626 0.86774089
|
|
0.81600218 0.94158382 0.84271225 0.89949371]
|
|
|
|
mean value: 0.8880350000215755
|
|
|
|
key: train_mcc
|
|
value: [0.93972107 0.94462498 0.92685969 0.93171963 0.93337455 0.94314811
|
|
0.94961342 0.92520415 0.93337455 0.93171967]
|
|
|
|
mean value: 0.9359359827608927
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.90510949 0.95620438 0.94160584 0.95588235 0.93382353
|
|
0.90441176 0.97058824 0.91911765 0.94852941]
|
|
|
|
mean value: 0.9427973379132675
|
|
|
|
key: train_accuracy
|
|
value: [0.96984515 0.97229014 0.96332518 0.96577017 0.96661238 0.97149837
|
|
0.9747557 0.96254072 0.96661238 0.96579805]
|
|
|
|
mean value: 0.9679048233423327
|
|
|
|
key: test_fscore
|
|
value: [0.99270073 0.90909091 0.95588235 0.94444444 0.95714286 0.93430657
|
|
0.91034483 0.97014925 0.92307692 0.95035461]
|
|
|
|
mean value: 0.9447493477213011
|
|
|
|
key: train_fscore
|
|
value: [0.96999189 0.97244733 0.96368039 0.96607431 0.9669088 0.97175141
|
|
0.97493937 0.9628433 0.9669088 0.96607431]
|
|
|
|
mean value: 0.9681619902040669
|
|
|
|
key: test_precision
|
|
value: [0.98550725 0.86666667 0.97014925 0.90666667 0.93055556 0.92753623
|
|
0.85714286 0.98484848 0.88 0.91780822]
|
|
|
|
mean value: 0.9226881182050526
|
|
|
|
key: train_precision
|
|
value: [0.96607431 0.96774194 0.95367412 0.9568 0.9584 0.9632
|
|
0.96789727 0.95512821 0.9584 0.95833333]
|
|
|
|
mean value: 0.9605649180027942
|
|
|
|
key: test_recall
|
|
value: [1. 0.95588235 0.94202899 0.98550725 0.98529412 0.94117647
|
|
0.97058824 0.95588235 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9692242114237
|
|
|
|
key: train_recall
|
|
value: [0.97394137 0.9771987 0.97389886 0.97553018 0.97557003 0.98045603
|
|
0.98208469 0.97068404 0.97557003 0.97394137]
|
|
|
|
mean value: 0.9758875291592053
|
|
|
|
key: test_roc_auc
|
|
value: [0.99275362 0.90547741 0.95630861 0.94128303 0.95588235 0.93382353
|
|
0.90441176 0.97058824 0.91911765 0.94852941]
|
|
|
|
mean value: 0.9428175618073317
|
|
|
|
key: train_roc_auc
|
|
value: [0.96984181 0.97228613 0.96333379 0.96577812 0.96661238 0.97149837
|
|
0.9747557 0.96254072 0.96661238 0.96579805]
|
|
|
|
mean value: 0.9679057446955486
|
|
|
|
key: test_jcc
|
|
value: [0.98550725 0.83333333 0.91549296 0.89473684 0.91780822 0.87671233
|
|
0.83544304 0.94202899 0.85714286 0.90540541]
|
|
|
|
mean value: 0.8963611213537285
|
|
|
|
key: train_jcc
|
|
value: [0.94173228 0.94637224 0.92990654 0.934375 0.9359375 0.94505495
|
|
0.9511041 0.92834891 0.9359375 0.934375 ]
|
|
|
|
mean value: 0.9383144020926913
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02025676 0.01463985 0.01642394 0.01580334 0.01426625 0.01391959
|
|
0.0137136 0.01319313 0.0138135 0.01418281]
|
|
|
|
mean value: 0.015021276473999024
|
|
|
|
key: score_time
|
|
value: [0.01330924 0.01044321 0.01135588 0.01076484 0.0106945 0.01014638
|
|
0.00978804 0.00987625 0.00961947 0.01057482]
|
|
|
|
mean value: 0.010657262802124024
|
|
|
|
key: test_mcc
|
|
value: [0.64981886 0.63862773 0.78169078 0.75191816 0.67676337 0.7540057
|
|
0.64423542 0.82388584 0.63573029 0.67647059]
|
|
|
|
mean value: 0.703314674416271
|
|
|
|
key: train_mcc
|
|
value: [0.71960479 0.72838719 0.710723 0.73443309 0.7079266 0.71094685
|
|
0.73513731 0.70105507 0.72545745 0.70273867]
|
|
|
|
mean value: 0.7176410015716607
|
|
|
|
key: test_accuracy
|
|
value: [0.82481752 0.81751825 0.89051095 0.87591241 0.83823529 0.875
|
|
0.81617647 0.91176471 0.81617647 0.83823529]
|
|
|
|
mean value: 0.8504347359381709
|
|
|
|
key: train_accuracy
|
|
value: [0.8590057 0.86389568 0.85493073 0.86715566 0.8534202 0.85504886
|
|
0.86726384 0.85016287 0.86237785 0.8509772 ]
|
|
|
|
mean value: 0.8584238589393373
|
|
|
|
key: test_fscore
|
|
value: [0.82089552 0.82517483 0.89361702 0.87591241 0.8358209 0.88111888
|
|
0.83221477 0.91044776 0.82517483 0.83823529]
|
|
|
|
mean value: 0.8538612199827047
|
|
|
|
key: train_fscore
|
|
value: [0.86367218 0.86671987 0.85828025 0.86822959 0.85736926 0.85850556
|
|
0.86991221 0.85350318 0.86533865 0.85441527]
|
|
|
|
mean value: 0.8615946032444376
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.78666667 0.875 0.88235294 0.84848485 0.84
|
|
0.7654321 0.92424242 0.78666667 0.83823529]
|
|
|
|
mean value: 0.8380414273453489
|
|
|
|
key: train_precision
|
|
value: [0.83664122 0.84976526 0.83825816 0.86057692 0.83487654 0.83850932
|
|
0.85289515 0.83489097 0.84711388 0.83514774]
|
|
|
|
mean value: 0.8428675171402082
|
|
|
|
key: test_recall
|
|
value: [0.80882353 0.86764706 0.91304348 0.86956522 0.82352941 0.92647059
|
|
0.91176471 0.89705882 0.86764706 0.83823529]
|
|
|
|
mean value: 0.872378516624041
|
|
|
|
key: train_recall
|
|
value: [0.89250814 0.88436482 0.87928222 0.87601958 0.88110749 0.87947883
|
|
0.88762215 0.87296417 0.88436482 0.87459283]
|
|
|
|
mean value: 0.8812305051782497
|
|
|
|
key: test_roc_auc
|
|
value: [0.82470162 0.8178815 0.89034527 0.87595908 0.83823529 0.875
|
|
0.81617647 0.91176471 0.81617647 0.83823529]
|
|
|
|
mean value: 0.8504475703324809
|
|
|
|
key: train_roc_auc
|
|
value: [0.85897838 0.86387898 0.85495056 0.86716288 0.8534202 0.85504886
|
|
0.86726384 0.85016287 0.86237785 0.8509772 ]
|
|
|
|
mean value: 0.8584221615273844
|
|
|
|
key: test_jcc
|
|
value: [0.69620253 0.70238095 0.80769231 0.77922078 0.71794872 0.7875
|
|
0.71264368 0.83561644 0.70238095 0.72151899]
|
|
|
|
mean value: 0.7463105345128135
|
|
|
|
key: train_jcc
|
|
value: [0.76005548 0.76478873 0.75174338 0.76714286 0.75034674 0.75208914
|
|
0.76977401 0.74444444 0.76264045 0.74583333]
|
|
|
|
mean value: 0.756885855885731
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01524782 0.01856661 0.01861858 0.0185709 0.01848412 0.01891041
|
|
0.02782226 0.01901007 0.0183835 0.01848745]
|
|
|
|
mean value: 0.019210171699523926
|
|
|
|
key: score_time
|
|
value: [0.0127418 0.01292229 0.01289725 0.01281047 0.01282048 0.01274705
|
|
0.01290607 0.01298213 0.01289558 0.01290154]
|
|
|
|
mean value: 0.012862467765808105
|
|
|
|
key: test_mcc
|
|
value: [0.70934757 0.64091263 0.68322489 0.7614264 0.73817324 0.72066617
|
|
0.55785938 0.82352941 0.64423542 0.76000982]
|
|
|
|
mean value: 0.7039384936898134
|
|
|
|
key: train_mcc
|
|
value: [0.7243465 0.71902729 0.71154553 0.70652812 0.69918792 0.71142953
|
|
0.71981239 0.70040764 0.71095972 0.69961749]
|
|
|
|
mean value: 0.7102862122707756
|
|
|
|
key: test_accuracy
|
|
value: [0.8540146 0.81751825 0.83941606 0.87591241 0.86764706 0.86029412
|
|
0.77205882 0.91176471 0.81617647 0.875 ]
|
|
|
|
mean value: 0.8489802490339201
|
|
|
|
key: train_accuracy
|
|
value: [0.8598207 0.85737571 0.85330073 0.85167074 0.84690554 0.85260586
|
|
0.85749186 0.84771987 0.8534202 0.84771987]
|
|
|
|
mean value: 0.8528031081342965
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.82758621 0.84931507 0.88590604 0.87323944 0.86131387
|
|
0.79470199 0.91176471 0.83221477 0.88435374]
|
|
|
|
mean value: 0.8577538677268463
|
|
|
|
key: train_fscore
|
|
value: [0.86748844 0.86486486 0.86132512 0.85825545 0.85582822 0.86172651
|
|
0.86528099 0.85626441 0.86089645 0.85559846]
|
|
|
|
mean value: 0.8607528903638494
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.77922078 0.80519481 0.825 0.83783784 0.85507246
|
|
0.72289157 0.91176471 0.7654321 0.82278481]
|
|
|
|
mean value: 0.8158532400394299
|
|
|
|
key: train_precision
|
|
value: [0.82309942 0.82232012 0.81605839 0.82116244 0.80869565 0.81151079
|
|
0.82043796 0.81077147 0.81911765 0.81350954]
|
|
|
|
mean value: 0.8166683432704045
|
|
|
|
key: test_recall
|
|
value: [0.88235294 0.88235294 0.89855072 0.95652174 0.91176471 0.86764706
|
|
0.88235294 0.91176471 0.91176471 0.95588235]
|
|
|
|
mean value: 0.9060954816709292
|
|
|
|
key: train_recall
|
|
value: [0.91693811 0.91205212 0.91190865 0.89885808 0.90879479 0.91856678
|
|
0.91530945 0.90716612 0.90716612 0.90228013]
|
|
|
|
mean value: 0.9099040336679225
|
|
|
|
key: test_roc_auc
|
|
value: [0.85421995 0.81798806 0.83898124 0.87531969 0.86764706 0.86029412
|
|
0.77205882 0.91176471 0.81617647 0.875 ]
|
|
|
|
mean value: 0.8489450127877238
|
|
|
|
key: train_roc_auc
|
|
value: [0.85977411 0.85733112 0.85334846 0.85170917 0.84690554 0.85260586
|
|
0.85749186 0.84771987 0.8534202 0.84771987]
|
|
|
|
mean value: 0.8528026048004421
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.70588235 0.73809524 0.79518072 0.775 0.75641026
|
|
0.65934066 0.83783784 0.71264368 0.79268293]
|
|
|
|
mean value: 0.7523073672506922
|
|
|
|
key: train_jcc
|
|
value: [0.76598639 0.76190476 0.7564276 0.75170532 0.74798928 0.75704698
|
|
0.76255088 0.74865591 0.75576662 0.74763833]
|
|
|
|
mean value: 0.7555672081895808
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01776791 0.0143683 0.01367235 0.01267362 0.01240182 0.01215315
|
|
0.01259208 0.01372862 0.01309514 0.0132134 ]
|
|
|
|
mean value: 0.01356663703918457
|
|
|
|
key: score_time
|
|
value: [0.04105425 0.02325535 0.02105951 0.02271795 0.02249193 0.02242374
|
|
0.02348661 0.02194905 0.02194166 0.02291274]
|
|
|
|
mean value: 0.024329280853271483
|
|
|
|
key: test_mcc
|
|
value: [0.75261265 0.76196863 0.88320546 0.70218993 0.88852332 0.77311134
|
|
0.75203572 0.76603235 0.7004012 0.88273483]
|
|
|
|
mean value: 0.7862815435156082
|
|
|
|
key: train_mcc
|
|
value: [0.85158414 0.85591536 0.84591114 0.84657754 0.84562892 0.85043233
|
|
0.84449734 0.86388679 0.85299584 0.83909896]
|
|
|
|
mean value: 0.8496528341555593
|
|
|
|
key: test_accuracy
|
|
value: [0.87591241 0.87591241 0.94160584 0.84671533 0.94117647 0.88235294
|
|
0.875 0.88235294 0.84558824 0.94117647]
|
|
|
|
mean value: 0.890779304422499
|
|
|
|
key: train_accuracy
|
|
value: [0.92420538 0.92665037 0.92176039 0.92176039 0.9218241 0.9242671
|
|
0.92100977 0.93078176 0.92508143 0.91856678]
|
|
|
|
mean value: 0.9235907472742766
|
|
|
|
key: test_fscore
|
|
value: [0.87769784 0.88435374 0.94202899 0.8590604 0.94444444 0.89041096
|
|
0.87943262 0.88571429 0.85714286 0.94202899]
|
|
|
|
mean value: 0.8962315127241446
|
|
|
|
key: train_fscore
|
|
value: [0.92740047 0.92946708 0.9245283 0.92488263 0.92440945 0.92671395
|
|
0.92392157 0.93322859 0.92801252 0.92125984]
|
|
|
|
mean value: 0.9263824405409482
|
|
|
|
key: test_precision
|
|
value: [0.85915493 0.82278481 0.94202899 0.8 0.89473684 0.83333333
|
|
0.84931507 0.86111111 0.79746835 0.92857143]
|
|
|
|
mean value: 0.858850486325596
|
|
|
|
key: train_precision
|
|
value: [0.89055472 0.89577039 0.892261 0.8887218 0.89481707 0.89770992
|
|
0.89107413 0.90136571 0.89307229 0.89176829]
|
|
|
|
mean value: 0.893711533581153
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.95588235 0.94202899 0.92753623 1. 0.95588235
|
|
0.91176471 0.91176471 0.92647059 0.95588235]
|
|
|
|
mean value: 0.9384271099744246
|
|
|
|
key: train_recall
|
|
value: [0.96742671 0.96579805 0.95921697 0.96411093 0.95602606 0.95765472
|
|
0.95928339 0.96742671 0.96579805 0.95276873]
|
|
|
|
mean value: 0.9615510306018885
|
|
|
|
key: test_roc_auc
|
|
value: [0.87606564 0.8764919 0.94160273 0.84612106 0.94117647 0.88235294
|
|
0.875 0.88235294 0.84558824 0.94117647]
|
|
|
|
mean value: 0.8907928388746803
|
|
|
|
key: train_roc_auc
|
|
value: [0.92417013 0.92661844 0.92179089 0.92179488 0.9218241 0.9242671
|
|
0.92100977 0.93078176 0.92508143 0.91856678]
|
|
|
|
mean value: 0.9235905277085513
|
|
|
|
key: test_jcc
|
|
value: [0.78205128 0.79268293 0.89041096 0.75294118 0.89473684 0.80246914
|
|
0.78481013 0.79487179 0.75 0.89041096]
|
|
|
|
mean value: 0.8135385202521164
|
|
|
|
key: train_jcc
|
|
value: [0.86462882 0.8682284 0.85964912 0.86026201 0.85944363 0.86343612
|
|
0.85860058 0.87481591 0.86569343 0.8540146 ]
|
|
|
|
mean value: 0.8628772629019651
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06691337 0.06051254 0.07731438 0.07657027 0.07596445 0.07846117
|
|
0.07570052 0.07647061 0.07562709 0.07878208]
|
|
|
|
mean value: 0.07423164844512939
|
|
|
|
key: score_time
|
|
value: [0.02085137 0.02084017 0.02438045 0.02414894 0.02422881 0.02476311
|
|
0.02408767 0.02523947 0.02440548 0.02516818]
|
|
|
|
mean value: 0.02381136417388916
|
|
|
|
key: test_mcc
|
|
value: [0.78111679 0.78527876 0.86948194 0.86311873 0.88273483 0.82675403
|
|
0.78017138 0.85331034 0.75665657 0.88580789]
|
|
|
|
mean value: 0.8284431268832849
|
|
|
|
key: train_mcc
|
|
value: [0.85879004 0.86071236 0.85462057 0.85617638 0.85007907 0.86004287
|
|
0.86436688 0.86022618 0.87464084 0.85473165]
|
|
|
|
mean value: 0.8594386838908423
|
|
|
|
key: test_accuracy
|
|
value: [0.89051095 0.89051095 0.93430657 0.9270073 0.94117647 0.91176471
|
|
0.88970588 0.92647059 0.875 0.94117647]
|
|
|
|
mean value: 0.9127629884070416
|
|
|
|
key: train_accuracy
|
|
value: [0.92909535 0.92991035 0.92665037 0.92746536 0.9242671 0.92915309
|
|
0.93159609 0.92915309 0.93648208 0.9267101 ]
|
|
|
|
mean value: 0.9290482997910743
|
|
|
|
key: test_fscore
|
|
value: [0.89051095 0.8951049 0.93333333 0.93243243 0.94202899 0.91549296
|
|
0.89208633 0.92537313 0.88275862 0.94366197]
|
|
|
|
mean value: 0.9152783610813747
|
|
|
|
key: train_fscore
|
|
value: [0.93045564 0.93152866 0.92857143 0.92930898 0.92648221 0.93133386
|
|
0.93333333 0.93144208 0.93838863 0.92868463]
|
|
|
|
mean value: 0.930952944168937
|
|
|
|
key: test_precision
|
|
value: [0.88405797 0.85333333 0.95454545 0.87341772 0.92857143 0.87837838
|
|
0.87323944 0.93939394 0.83116883 0.90540541]
|
|
|
|
mean value: 0.892151189994997
|
|
|
|
key: train_precision
|
|
value: [0.91365777 0.91121495 0.90417311 0.90557276 0.90015361 0.90352221
|
|
0.91021672 0.90229008 0.91104294 0.90432099]
|
|
|
|
mean value: 0.9066165128215168
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.94117647 0.91304348 1. 0.95588235 0.95588235
|
|
0.91176471 0.91176471 0.94117647 0.98529412]
|
|
|
|
mean value: 0.941304347826087
|
|
|
|
key: train_recall
|
|
value: [0.94788274 0.95276873 0.954323 0.954323 0.95439739 0.96091205
|
|
0.95765472 0.96254072 0.96742671 0.95439739]
|
|
|
|
mean value: 0.9566626459288701
|
|
|
|
key: test_roc_auc
|
|
value: [0.8905584 0.89087809 0.93446292 0.92647059 0.94117647 0.91176471
|
|
0.88970588 0.92647059 0.875 0.94117647]
|
|
|
|
mean value: 0.9127664109121909
|
|
|
|
key: train_roc_auc
|
|
value: [0.92908003 0.92989171 0.9266729 0.92748723 0.9242671 0.92915309
|
|
0.93159609 0.92915309 0.93648208 0.9267101 ]
|
|
|
|
mean value: 0.929049343486139
|
|
|
|
key: test_jcc
|
|
value: [0.80263158 0.81012658 0.875 0.87341772 0.89041096 0.84415584
|
|
0.80519481 0.86111111 0.79012346 0.89333333]
|
|
|
|
mean value: 0.8445505392234164
|
|
|
|
key: train_jcc
|
|
value: [0.86995516 0.87183308 0.86666667 0.86795252 0.86303387 0.87149188
|
|
0.875 0.87168142 0.88392857 0.86686391]
|
|
|
|
mean value: 0.8708407072769933
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [3.54420257 2.05947042 4.31787467 2.78397083 1.90792727 4.28846169
|
|
4.24948835 3.81972098 3.66617012 3.65981555]
|
|
|
|
mean value: 3.4297102451324464
|
|
|
|
key: score_time
|
|
value: [0.0131371 0.01322865 0.01350379 0.01714802 0.0131588 0.01303959
|
|
0.01309347 0.0130167 0.01305389 0.0130465 ]
|
|
|
|
mean value: 0.013542652130126953
|
|
|
|
key: test_mcc
|
|
value: [0.9001543 0.8110473 0.92710997 0.85977656 0.91334626 0.81101892
|
|
0.88273483 0.89715584 0.78981412 0.91334626]
|
|
|
|
mean value: 0.8705504359250393
|
|
|
|
key: train_mcc
|
|
value: [0.94182238 0.9315403 0.96577139 0.93394821 0.89971038 0.9691595
|
|
0.9257257 0.93051831 0.95464559 0.95114511]
|
|
|
|
mean value: 0.9403986857141324
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.90510949 0.96350365 0.9270073 0.95588235 0.90441176
|
|
0.94117647 0.94852941 0.88970588 0.95588235]
|
|
|
|
mean value: 0.9340113782739373
|
|
|
|
key: train_accuracy
|
|
value: [0.97066015 0.96577017 0.98288509 0.96658517 0.9495114 0.98452769
|
|
0.96172638 0.96416938 0.9771987 0.97557003]
|
|
|
|
mean value: 0.9698604153559036
|
|
|
|
key: test_fscore
|
|
value: [0.94656489 0.90647482 0.96350365 0.93150685 0.95714286 0.90780142
|
|
0.94029851 0.94890511 0.89795918 0.95714286]
|
|
|
|
mean value: 0.935730013794081
|
|
|
|
key: train_fscore
|
|
value: [0.97019868 0.96579805 0.98285714 0.96722622 0.95047923 0.98464026
|
|
0.96033755 0.96535433 0.97745572 0.97553018]
|
|
|
|
mean value: 0.9699877354381214
|
|
|
|
key: test_precision
|
|
value: [0.98412698 0.88732394 0.97058824 0.88311688 0.93055556 0.87671233
|
|
0.95454545 0.94202899 0.83544304 0.93055556]
|
|
|
|
mean value: 0.9194996964105575
|
|
|
|
key: train_precision
|
|
value: [0.98653199 0.96579805 0.98366013 0.94827586 0.93260188 0.97752809
|
|
0.99649737 0.93445122 0.96656051 0.97712418]
|
|
|
|
mean value: 0.9669029280790539
|
|
|
|
key: test_recall
|
|
value: [0.91176471 0.92647059 0.95652174 0.98550725 0.98529412 0.94117647
|
|
0.92647059 0.95588235 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9544970161977835
|
|
|
|
key: train_recall
|
|
value: [0.95439739 0.96579805 0.98205546 0.98694943 0.96905537 0.99185668
|
|
0.9267101 0.99837134 0.98859935 0.97394137]
|
|
|
|
mean value: 0.9737734535657921
|
|
|
|
key: test_roc_auc
|
|
value: [0.94863598 0.90526428 0.96355499 0.92657715 0.95588235 0.90441176
|
|
0.94117647 0.94852941 0.88970588 0.95588235]
|
|
|
|
mean value: 0.9339620630861041
|
|
|
|
key: train_roc_auc
|
|
value: [0.97067341 0.96577015 0.98288441 0.96660175 0.9495114 0.98452769
|
|
0.96172638 0.96416938 0.9771987 0.97557003]
|
|
|
|
mean value: 0.9698633303399206
|
|
|
|
key: test_jcc
|
|
value: [0.89855072 0.82894737 0.92957746 0.87179487 0.91780822 0.83116883
|
|
0.88732394 0.90277778 0.81481481 0.91780822]
|
|
|
|
mean value: 0.8800572235421898
|
|
|
|
key: train_jcc
|
|
value: [0.94212219 0.93385827 0.96629213 0.93653251 0.90563166 0.96974522
|
|
0.9237013 0.93302892 0.95590551 0.9522293 ]
|
|
|
|
mean value: 0.9419047007975033
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.15721059 0.12421274 0.11584687 0.1442461 0.16757679 0.12364697
|
|
0.1610961 0.1497941 0.11538434 0.16368508]
|
|
|
|
mean value: 0.14226996898651123
|
|
|
|
key: score_time
|
|
value: [0.0095911 0.01425385 0.01022911 0.00966549 0.01003146 0.00989795
|
|
0.01006746 0.00960088 0.00994182 0.01044965]
|
|
|
|
mean value: 0.01037287712097168
|
|
|
|
key: test_mcc
|
|
value: [0.8978896 0.86948194 0.85434012 0.8251228 0.83832595 0.82388584
|
|
0.75008111 0.91176471 0.86849267 0.94158382]
|
|
|
|
mean value: 0.858096855393985
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.93430657 0.9270073 0.91240876 0.91911765 0.91176471
|
|
0.875 0.95588235 0.93382353 0.97058824]
|
|
|
|
mean value: 0.9288804207814513
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94814815 0.9352518 0.92857143 0.91428571 0.91970803 0.91044776
|
|
0.87407407 0.95588235 0.9352518 0.97101449]
|
|
|
|
mean value: 0.9292635598287577
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95522388 0.91549296 0.91549296 0.90140845 0.91304348 0.92424242
|
|
0.88059701 0.95588235 0.91549296 0.95714286]
|
|
|
|
mean value: 0.9234019332053377
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.95588235 0.94202899 0.92753623 0.92647059 0.89705882
|
|
0.86764706 0.95588235 0.95588235 0.98529412]
|
|
|
|
mean value: 0.9354859335038364
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9488491 0.93446292 0.92689685 0.91229753 0.91911765 0.91176471
|
|
0.875 0.95588235 0.93382353 0.97058824]
|
|
|
|
mean value: 0.9288682864450128
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90140845 0.87837838 0.86666667 0.84210526 0.85135135 0.83561644
|
|
0.77631579 0.91549296 0.87837838 0.94366197]
|
|
|
|
mean value: 0.8689375646044208
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.20700741 0.2227931 0.21003532 0.20532703 0.20421553 0.22666597
|
|
0.20230484 0.22685504 0.20340753 0.21781135]
|
|
|
|
mean value: 0.21264231204986572
|
|
|
|
key: score_time
|
|
value: [0.02035546 0.02207088 0.02063417 0.02015138 0.02062988 0.02127051
|
|
0.02150893 0.02174425 0.02023482 0.02667117]
|
|
|
|
mean value: 0.02152714729309082
|
|
|
|
key: test_mcc
|
|
value: [0.92944673 0.86948194 0.92951942 0.92791659 0.95598573 0.91215932
|
|
0.86849267 0.91334626 0.83905224 0.94158382]
|
|
|
|
mean value: 0.9086984732737446
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96350365 0.93430657 0.96350365 0.96350365 0.97794118 0.95588235
|
|
0.93382353 0.95588235 0.91911765 0.97058824]
|
|
|
|
mean value: 0.9538052812365823
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96183206 0.9352518 0.96240602 0.96296296 0.97810219 0.95652174
|
|
0.9352518 0.95454545 0.92086331 0.97014925]
|
|
|
|
mean value: 0.9537886582732334
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.91549296 1. 0.98484848 0.97101449 0.94285714
|
|
0.91549296 0.984375 0.90140845 0.98484848]
|
|
|
|
mean value: 0.9600337971504919
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92647059 0.95588235 0.92753623 0.94202899 0.98529412 0.97058824
|
|
0.95588235 0.92647059 0.94117647 0.95588235]
|
|
|
|
mean value: 0.9487212276214834
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96323529 0.93446292 0.96376812 0.96366155 0.97794118 0.95588235
|
|
0.93382353 0.95588235 0.91911765 0.97058824]
|
|
|
|
mean value: 0.95383631713555
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92647059 0.87837838 0.92753623 0.92857143 0.95714286 0.91666667
|
|
0.87837838 0.91304348 0.85333333 0.94202899]
|
|
|
|
mean value: 0.9121550326358511
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02630854 0.02538991 0.02353644 0.02398157 0.02583075 0.02810669
|
|
0.03100824 0.02225041 0.02205133 0.02273846]
|
|
|
|
mean value: 0.025120234489440917
|
|
|
|
key: score_time
|
|
value: [0.01718497 0.01694894 0.01576543 0.01617956 0.0177052 0.01829982
|
|
0.01477838 0.01297951 0.01446557 0.01420522]
|
|
|
|
mean value: 0.015851259231567383
|
|
|
|
key: test_mcc
|
|
value: [0.78111679 0.66616982 0.72918846 0.81031543 0.82495791 0.73656956
|
|
0.57408838 0.82352941 0.76470588 0.808911 ]
|
|
|
|
mean value: 0.7519552661564883
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.89051095 0.83211679 0.86131387 0.90510949 0.91176471 0.86764706
|
|
0.78676471 0.91176471 0.88235294 0.90441176]
|
|
|
|
mean value: 0.8753756977243452
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.89051095 0.83687943 0.85271318 0.90510949 0.91428571 0.87142857
|
|
0.78195489 0.91176471 0.88235294 0.90510949]
|
|
|
|
mean value: 0.875210935791714
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88405797 0.80821918 0.91666667 0.91176471 0.88888889 0.84722222
|
|
0.8 0.91176471 0.88235294 0.89855072]
|
|
|
|
mean value: 0.874948800445332
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.86764706 0.79710145 0.89855072 0.94117647 0.89705882
|
|
0.76470588 0.91176471 0.88235294 0.91176471]
|
|
|
|
mean value: 0.8769181585677749
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.8905584 0.83237425 0.86178602 0.90515772 0.91176471 0.86764706
|
|
0.78676471 0.91176471 0.88235294 0.90441176]
|
|
|
|
mean value: 0.8754582267689685
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.80263158 0.7195122 0.74324324 0.82666667 0.84210526 0.7721519
|
|
0.64197531 0.83783784 0.78947368 0.82666667]
|
|
|
|
mean value: 0.7802264343228308
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [4.85859299 4.71020794 4.71203661 5.00940704 4.99922633 5.03658009
|
|
4.98546529 5.07385755 4.9456017 5.48568153]
|
|
|
|
mean value: 4.981665706634521
|
|
|
|
key: score_time
|
|
value: [0.12868524 0.10628939 0.11306834 0.11914802 0.11893177 0.11918879
|
|
0.11937928 0.11875987 0.11915421 0.10896921]
|
|
|
|
mean value: 0.11715741157531738
|
|
|
|
key: test_mcc
|
|
value: [0.92787101 0.8978896 0.94201665 0.97080136 0.94158382 0.92657079
|
|
0.92657079 0.97100831 0.88580789 0.98540068]
|
|
|
|
mean value: 0.9375520891751091
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96350365 0.94890511 0.97080292 0.98540146 0.97058824 0.96323529
|
|
0.96323529 0.98529412 0.94117647 0.99264706]
|
|
|
|
mean value: 0.9684789609274367
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96240602 0.94814815 0.97058824 0.98550725 0.97101449 0.96296296
|
|
0.96350365 0.98507463 0.94366197 0.99270073]
|
|
|
|
mean value: 0.9685568078831959
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.98461538 0.95522388 0.98507463 0.98550725 0.95714286 0.97014925
|
|
0.95652174 1. 0.90540541 0.98550725]
|
|
|
|
mean value: 0.9685147640241735
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.94117647 0.95652174 0.98550725 0.98529412 0.95588235
|
|
0.97058824 0.97058824 0.98529412 1. ]
|
|
|
|
mean value: 0.9692028985507246
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96334186 0.9488491 0.97090793 0.98540068 0.97058824 0.96323529
|
|
0.96323529 0.98529412 0.94117647 0.99264706]
|
|
|
|
mean value: 0.9684676044330777
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92753623 0.90140845 0.94285714 0.97142857 0.94366197 0.92857143
|
|
0.92957746 0.97058824 0.89333333 0.98550725]
|
|
|
|
mean value: 0.9394470077069407
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.55692935 3.20952964 3.28124666 1.61803484 1.44277549 1.39598799
|
|
1.42690277 1.50912189 1.46272874 1.47186947]
|
|
|
|
mean value: 1.8375126838684082
|
|
|
|
key: score_time
|
|
value: [0.18893909 0.23083615 0.22105742 0.13914585 0.21864581 0.16200542
|
|
0.13056111 0.18395448 0.22659445 0.13184023]
|
|
|
|
mean value: 0.18335800170898436
|
|
|
|
key: test_mcc
|
|
value: [0.95629932 0.91240409 0.95630861 0.94160273 0.91176471 0.92657079
|
|
0.91215932 0.95598573 0.87000211 0.95598573]
|
|
|
|
mean value: 0.9299083136889414
|
|
|
|
key: train_mcc
|
|
value: [0.97392522 0.97555137 0.97556187 0.97392011 0.97070464 0.97232431
|
|
0.97557133 0.97070464 0.97882866 0.97068919]
|
|
|
|
mean value: 0.973778132952908
|
|
|
|
key: test_accuracy
|
|
value: [0.97810219 0.95620438 0.97810219 0.97080292 0.95588235 0.96323529
|
|
0.95588235 0.97794118 0.93382353 0.97794118]
|
|
|
|
mean value: 0.9647917561185058
|
|
|
|
key: train_accuracy
|
|
value: [0.98696007 0.98777506 0.98777506 0.98696007 0.98534202 0.98615635
|
|
0.98778502 0.98534202 0.98941368 0.98534202]
|
|
|
|
mean value: 0.9868851360140594
|
|
|
|
key: test_fscore
|
|
value: [0.97777778 0.95588235 0.97810219 0.97101449 0.95588235 0.96296296
|
|
0.95652174 0.97777778 0.93617021 0.97810219]
|
|
|
|
mean value: 0.9650194048612931
|
|
|
|
key: train_fscore
|
|
value: [0.98699187 0.98779496 0.98779496 0.98694943 0.98538961 0.98619009
|
|
0.98779496 0.98538961 0.98940505 0.98536585]
|
|
|
|
mean value: 0.9869066381471465
|
|
|
|
key: test_precision
|
|
value: [0.98507463 0.95588235 0.98529412 0.97101449 0.95588235 0.97014925
|
|
0.94285714 0.98507463 0.90410959 0.97101449]
|
|
|
|
mean value: 0.9626353048397583
|
|
|
|
key: train_precision
|
|
value: [0.98538961 0.98699187 0.98538961 0.98694943 0.98220065 0.98379254
|
|
0.98699187 0.98220065 0.99021207 0.98376623]
|
|
|
|
mean value: 0.9853884534267398
|
|
|
|
key: test_recall
|
|
value: [0.97058824 0.95588235 0.97101449 0.97101449 0.95588235 0.95588235
|
|
0.97058824 0.97058824 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9677323103154305
|
|
|
|
key: train_recall
|
|
value: [0.98859935 0.98859935 0.99021207 0.98694943 0.98859935 0.98859935
|
|
0.98859935 0.98859935 0.98859935 0.98697068]
|
|
|
|
mean value: 0.9884327624594162
|
|
|
|
key: test_roc_auc
|
|
value: [0.97804774 0.95620205 0.97815431 0.97080136 0.95588235 0.96323529
|
|
0.95588235 0.97794118 0.93382353 0.97794118]
|
|
|
|
mean value: 0.9647911338448424
|
|
|
|
key: train_roc_auc
|
|
value: [0.98695873 0.98777439 0.98777705 0.98696006 0.98534202 0.98615635
|
|
0.98778502 0.98534202 0.98941368 0.98534202]
|
|
|
|
mean value: 0.9868851326577786
|
|
|
|
key: test_jcc
|
|
value: [0.95652174 0.91549296 0.95714286 0.94366197 0.91549296 0.92857143
|
|
0.91666667 0.95652174 0.88 0.95714286]
|
|
|
|
mean value: 0.9327215175108623
|
|
|
|
key: train_jcc
|
|
value: [0.97431782 0.97588424 0.97588424 0.9742351 0.9712 0.97275641
|
|
0.97588424 0.9712 0.97903226 0.97115385]
|
|
|
|
mean value: 0.9741548169278077
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.04003906 0.04326582 0.03964067 0.04014897 0.04897428 0.07177114
|
|
0.09451461 0.06656861 0.07353854 0.06714416]
|
|
|
|
mean value: 0.05856058597564697
|
|
|
|
key: score_time
|
|
value: [0.02529025 0.02421188 0.02648306 0.0236609 0.02287126 0.0362916
|
|
0.03969383 0.04353642 0.03938127 0.03975558]
|
|
|
|
mean value: 0.032117605209350586
|
|
|
|
key: test_mcc
|
|
value: [0.70934757 0.64091263 0.68322489 0.7614264 0.73817324 0.72066617
|
|
0.55785938 0.82352941 0.64423542 0.76000982]
|
|
|
|
mean value: 0.7039384936898134
|
|
|
|
key: train_mcc
|
|
value: [0.7243465 0.71902729 0.71154553 0.70652812 0.69918792 0.71142953
|
|
0.71981239 0.70040764 0.71095972 0.69961749]
|
|
|
|
mean value: 0.7102862122707756
|
|
|
|
key: test_accuracy
|
|
value: [0.8540146 0.81751825 0.83941606 0.87591241 0.86764706 0.86029412
|
|
0.77205882 0.91176471 0.81617647 0.875 ]
|
|
|
|
mean value: 0.8489802490339201
|
|
|
|
key: train_accuracy
|
|
value: [0.8598207 0.85737571 0.85330073 0.85167074 0.84690554 0.85260586
|
|
0.85749186 0.84771987 0.8534202 0.84771987]
|
|
|
|
mean value: 0.8528031081342965
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.82758621 0.84931507 0.88590604 0.87323944 0.86131387
|
|
0.79470199 0.91176471 0.83221477 0.88435374]
|
|
|
|
mean value: 0.8577538677268463
|
|
|
|
key: train_fscore
|
|
value: [0.86748844 0.86486486 0.86132512 0.85825545 0.85582822 0.86172651
|
|
0.86528099 0.85626441 0.86089645 0.85559846]
|
|
|
|
mean value: 0.8607528903638494
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.77922078 0.80519481 0.825 0.83783784 0.85507246
|
|
0.72289157 0.91176471 0.7654321 0.82278481]
|
|
|
|
mean value: 0.8158532400394299
|
|
|
|
key: train_precision
|
|
value: [0.82309942 0.82232012 0.81605839 0.82116244 0.80869565 0.81151079
|
|
0.82043796 0.81077147 0.81911765 0.81350954]
|
|
|
|
mean value: 0.8166683432704045
|
|
|
|
key: test_recall
|
|
value: [0.88235294 0.88235294 0.89855072 0.95652174 0.91176471 0.86764706
|
|
0.88235294 0.91176471 0.91176471 0.95588235]
|
|
|
|
mean value: 0.9060954816709292
|
|
|
|
key: train_recall
|
|
value: [0.91693811 0.91205212 0.91190865 0.89885808 0.90879479 0.91856678
|
|
0.91530945 0.90716612 0.90716612 0.90228013]
|
|
|
|
mean value: 0.9099040336679225
|
|
|
|
key: test_roc_auc
|
|
value: [0.85421995 0.81798806 0.83898124 0.87531969 0.86764706 0.86029412
|
|
0.77205882 0.91176471 0.81617647 0.875 ]
|
|
|
|
mean value: 0.8489450127877238
|
|
|
|
key: train_roc_auc
|
|
value: [0.85977411 0.85733112 0.85334846 0.85170917 0.84690554 0.85260586
|
|
0.85749186 0.84771987 0.8534202 0.84771987]
|
|
|
|
mean value: 0.8528026048004421
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.70588235 0.73809524 0.79518072 0.775 0.75641026
|
|
0.65934066 0.83783784 0.71264368 0.79268293]
|
|
|
|
mean value: 0.7523073672506922
|
|
|
|
key: train_jcc
|
|
value: [0.76598639 0.76190476 0.7564276 0.75170532 0.74798928 0.75704698
|
|
0.76255088 0.74865591 0.75576662 0.74763833]
|
|
|
|
mean value: 0.7555672081895808
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [9.29590559 7.42686653 8.49374866 7.30686116 8.50415301 6.25112653
|
|
3.41065454 2.65957332 6.32901025 8.18140841]
|
|
|
|
mean value: 6.785930800437927
|
|
|
|
key: score_time
|
|
value: [0.02076149 0.03504729 0.02796507 0.03136373 0.01840234 0.02318001
|
|
0.01379251 0.01399994 0.03346467 0.01805067]
|
|
|
|
mean value: 0.023602771759033202
|
|
|
|
key: test_mcc
|
|
value: [0.98550418 0.91281179 0.89863497 0.95629932 0.95681396 0.92657079
|
|
0.89715584 0.95598573 0.91533482 0.95681396]
|
|
|
|
mean value: 0.9361925353808127
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.95620438 0.94890511 0.97810219 0.97794118 0.96323529
|
|
0.94852941 0.97794118 0.95588235 0.97794118]
|
|
|
|
mean value: 0.9677382996994418
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.99259259 0.95652174 0.95035461 0.97841727 0.97841727 0.96350365
|
|
0.94890511 0.97810219 0.95774648 0.97841727]
|
|
|
|
mean value: 0.9682978167991605
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.94285714 0.93055556 0.97142857 0.95774648 0.95652174
|
|
0.94202899 0.97101449 0.91891892 0.95774648]
|
|
|
|
mean value: 0.9548818363897972
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.98529412 0.97058824 0.97101449 0.98550725 1. 0.97058824
|
|
0.95588235 0.98529412 1. 1. ]
|
|
|
|
mean value: 0.9824168797953965
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.99264706 0.95630861 0.94874254 0.97804774 0.97794118 0.96323529
|
|
0.94852941 0.97794118 0.95588235 0.97794118]
|
|
|
|
mean value: 0.9677216538789429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.98529412 0.91666667 0.90540541 0.95774648 0.95774648 0.92957746
|
|
0.90277778 0.95714286 0.91891892 0.95774648]
|
|
|
|
mean value: 0.9389022644967135
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.12504745 0.15010977 0.14459181 0.12664557 0.143291 0.1131978
|
|
0.13656402 0.12558675 0.13988376 0.12933922]
|
|
|
|
mean value: 0.13342571258544922
|
|
|
|
key: score_time
|
|
value: [0.02595425 0.02889538 0.01287436 0.02456045 0.02117848 0.02104068
|
|
0.02054429 0.03904009 0.02880216 0.04108047]
|
|
|
|
mean value: 0.026397061347961426
|
|
|
|
key: test_mcc
|
|
value: [0.94199209 0.85739162 0.89863497 0.88654289 0.84271225 0.86849267
|
|
0.78632938 0.89715584 0.88580789 0.8722811 ]
|
|
|
|
mean value: 0.8737340702606493
|
|
|
|
key: train_mcc
|
|
value: [0.91751286 0.92541695 0.91414846 0.92215919 0.92047016 0.92054835
|
|
0.92539568 0.91263814 0.92073412 0.91728977]
|
|
|
|
mean value: 0.9196313681335729
|
|
|
|
key: test_accuracy
|
|
value: [0.97080292 0.9270073 0.94890511 0.94160584 0.91911765 0.93382353
|
|
0.88970588 0.94852941 0.94117647 0.93382353]
|
|
|
|
mean value: 0.9354497638471446
|
|
|
|
key: train_accuracy
|
|
value: [0.95843521 0.96251019 0.95680522 0.9608802 0.96009772 0.96009772
|
|
0.96254072 0.95602606 0.96009772 0.95846906]
|
|
|
|
mean value: 0.9595959797073979
|
|
|
|
key: test_fscore
|
|
value: [0.97014925 0.92957746 0.95035461 0.94444444 0.92307692 0.9352518
|
|
0.89655172 0.94814815 0.94366197 0.93706294]
|
|
|
|
mean value: 0.9378279275711674
|
|
|
|
key: train_fscore
|
|
value: [0.95923261 0.96308186 0.957498 0.96141479 0.96057924 0.96064257
|
|
0.96302251 0.9568 0.96076861 0.95903614]
|
|
|
|
mean value: 0.9602076343607397
|
|
|
|
key: test_precision
|
|
value: [0.98484848 0.89189189 0.93055556 0.90666667 0.88 0.91549296
|
|
0.84415584 0.95522388 0.90540541 0.89333333]
|
|
|
|
mean value: 0.9107574020200675
|
|
|
|
key: train_precision
|
|
value: [0.94191523 0.94936709 0.94164038 0.94770206 0.9491256 0.94770206
|
|
0.95079365 0.94025157 0.94488189 0.94611727]
|
|
|
|
mean value: 0.9459496798466626
|
|
|
|
key: test_recall
|
|
value: [0.95588235 0.97058824 0.97101449 0.98550725 0.97058824 0.95588235
|
|
0.95588235 0.94117647 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9677109974424553
|
|
|
|
key: train_recall
|
|
value: [0.9771987 0.9771987 0.97389886 0.97553018 0.9723127 0.97394137
|
|
0.97557003 0.97394137 0.9771987 0.9723127 ]
|
|
|
|
mean value: 0.9749103304621368
|
|
|
|
key: test_roc_auc
|
|
value: [0.9706948 0.9273231 0.94874254 0.94128303 0.91911765 0.93382353
|
|
0.88970588 0.94852941 0.94117647 0.93382353]
|
|
|
|
mean value: 0.9354219948849105
|
|
|
|
key: train_roc_auc
|
|
value: [0.9584199 0.96249821 0.95681914 0.96089213 0.96009772 0.96009772
|
|
0.96254072 0.95602606 0.96009772 0.95846906]
|
|
|
|
mean value: 0.9595958361451928
|
|
|
|
key: test_jcc
|
|
value: [0.94202899 0.86842105 0.90540541 0.89473684 0.85714286 0.87837838
|
|
0.8125 0.90140845 0.89333333 0.88157895]
|
|
|
|
mean value: 0.8834934252576709
|
|
|
|
key: train_jcc
|
|
value: [0.92165899 0.92879257 0.91846154 0.92569659 0.92414861 0.92426584
|
|
0.92868217 0.91717791 0.92449923 0.9212963 ]
|
|
|
|
mean value: 0.9234679748417127
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0395906 0.04507875 0.05771208 0.04412246 0.0433681 0.04452419
|
|
0.05319262 0.03797746 0.05140471 0.03866053]
|
|
|
|
mean value: 0.045563149452209475
|
|
|
|
key: score_time
|
|
value: [0.03348851 0.02256203 0.02129936 0.0222764 0.0238862 0.03728747
|
|
0.0332911 0.0254643 0.02613497 0.02396679]
|
|
|
|
mean value: 0.02696571350097656
|
|
|
|
key: test_mcc
|
|
value: [0.70801364 0.59804827 0.73858362 0.73747083 0.76503685 0.7972271
|
|
0.69305253 0.75008111 0.69125122 0.85628096]
|
|
|
|
mean value: 0.7335046136648194
|
|
|
|
key: train_mcc
|
|
value: [0.74611094 0.7540916 0.73661201 0.73642248 0.73671937 0.73495768
|
|
0.73997919 0.73002908 0.73997919 0.72517997]
|
|
|
|
mean value: 0.738008152481437
|
|
|
|
key: test_accuracy
|
|
value: [0.8540146 0.79562044 0.86861314 0.86861314 0.88235294 0.89705882
|
|
0.84558824 0.875 0.84558824 0.92647059]
|
|
|
|
mean value: 0.8658920137398025
|
|
|
|
key: train_accuracy
|
|
value: [0.87286064 0.87693562 0.86797066 0.86797066 0.86807818 0.86726384
|
|
0.86970684 0.86482085 0.86970684 0.86237785]
|
|
|
|
mean value: 0.868769196870628
|
|
|
|
key: test_fscore
|
|
value: [0.85294118 0.80821918 0.86567164 0.87142857 0.88059701 0.90140845
|
|
0.85106383 0.87591241 0.84671533 0.92957746]
|
|
|
|
mean value: 0.8683535065204239
|
|
|
|
key: train_fscore
|
|
value: [0.875 0.87851971 0.87060703 0.87019231 0.87060703 0.8694956
|
|
0.87220447 0.86698718 0.87220447 0.86469175]
|
|
|
|
mean value: 0.8710509550632397
|
|
|
|
key: test_precision
|
|
value: [0.85294118 0.75641026 0.89230769 0.85915493 0.89393939 0.86486486
|
|
0.82191781 0.86956522 0.84057971 0.89189189]
|
|
|
|
mean value: 0.8543572941217562
|
|
|
|
key: train_precision
|
|
value: [0.86119874 0.86804452 0.85289515 0.85511811 0.85423197 0.85511811
|
|
0.85579937 0.8533123 0.85579937 0.8503937 ]
|
|
|
|
mean value: 0.8561911347045577
|
|
|
|
key: test_recall
|
|
value: [0.85294118 0.86764706 0.84057971 0.88405797 0.86764706 0.94117647
|
|
0.88235294 0.88235294 0.85294118 0.97058824]
|
|
|
|
mean value: 0.884228473998295
|
|
|
|
key: train_recall
|
|
value: [0.88925081 0.88925081 0.88907015 0.8858075 0.88762215 0.88436482
|
|
0.88925081 0.88110749 0.88925081 0.87947883]
|
|
|
|
mean value: 0.8864454198128496
|
|
|
|
key: test_roc_auc
|
|
value: [0.85400682 0.79614237 0.86881927 0.86849957 0.88235294 0.89705882
|
|
0.84558824 0.875 0.84558824 0.92647059]
|
|
|
|
mean value: 0.8659526854219949
|
|
|
|
key: train_roc_auc
|
|
value: [0.87284727 0.87692557 0.86798784 0.86798519 0.86807818 0.86726384
|
|
0.86970684 0.86482085 0.86970684 0.86237785]
|
|
|
|
mean value: 0.8687700261967893
|
|
|
|
key: test_jcc
|
|
value: [0.74358974 0.67816092 0.76315789 0.7721519 0.78666667 0.82051282
|
|
0.74074074 0.77922078 0.73417722 0.86842105]
|
|
|
|
mean value: 0.7686799731563452
|
|
|
|
key: train_jcc
|
|
value: [0.77777778 0.78335725 0.7708628 0.77021277 0.7708628 0.76912181
|
|
0.7733711 0.76520509 0.7733711 0.76163611]
|
|
|
|
mean value: 0.771577861199781
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02968049 0.07493019 0.06887722 0.09048057 0.08637094 0.05483675
|
|
0.0843811 0.06575966 0.05945992 0.07291675]
|
|
|
|
mean value: 0.06876935958862304
|
|
|
|
key: score_time
|
|
value: [0.0133028 0.04360104 0.03298616 0.02097654 0.01297069 0.02053523
|
|
0.03294492 0.01312232 0.02019286 0.020895 ]
|
|
|
|
mean value: 0.023152756690979003
|
|
|
|
key: test_mcc
|
|
value: [0.69429215 0.75857279 0.78324384 0.87308606 0.81150267 0.86849267
|
|
0.72627304 0.92657079 0.60999428 0.79909587]
|
|
|
|
mean value: 0.7851124168520672
|
|
|
|
key: train_mcc
|
|
value: [0.66621184 0.8643186 0.82741367 0.90240059 0.77738194 0.91019134
|
|
0.81688487 0.8969845 0.76596494 0.7890932 ]
|
|
|
|
mean value: 0.821684548538899
|
|
|
|
key: test_accuracy
|
|
value: [0.82481752 0.86861314 0.88321168 0.93430657 0.89705882 0.93382353
|
|
0.85294118 0.96323529 0.79411765 0.88970588]
|
|
|
|
mean value: 0.8841831258050665
|
|
|
|
key: train_accuracy
|
|
value: [0.80929095 0.92909535 0.90953545 0.95028525 0.87785016 0.95439739
|
|
0.9014658 0.94788274 0.87214984 0.88517915]
|
|
|
|
mean value: 0.903713209039818
|
|
|
|
key: test_fscore
|
|
value: [0.85 0.88157895 0.87096774 0.93793103 0.90666667 0.9352518
|
|
0.86842105 0.96296296 0.76271186 0.90066225]
|
|
|
|
mean value: 0.8877154320671432
|
|
|
|
key: train_fscore
|
|
value: [0.83928571 0.93312836 0.90254609 0.95177866 0.89067055 0.95562599
|
|
0.90976883 0.94920635 0.85503232 0.89639971]
|
|
|
|
mean value: 0.9083442572874197
|
|
|
|
key: test_precision
|
|
value: [0.73913043 0.79761905 0.98181818 0.89473684 0.82926829 0.91549296
|
|
0.78571429 0.97014925 0.9 0.81927711]
|
|
|
|
mean value: 0.8633206404633871
|
|
|
|
key: train_precision
|
|
value: [0.72565321 0.88355167 0.97718631 0.92331288 0.8060686 0.93055556
|
|
0.83906465 0.92569659 0.98720682 0.81659973]
|
|
|
|
mean value: 0.8814896031917655
|
|
|
|
key: test_recall
|
|
value: [1. 0.98529412 0.7826087 0.98550725 1. 0.95588235
|
|
0.97058824 0.95588235 0.66176471 1. ]
|
|
|
|
mean value: 0.9297527706734868
|
|
|
|
key: train_recall
|
|
value: [0.99511401 0.98859935 0.83849918 0.98205546 0.99511401 0.98208469
|
|
0.99348534 0.97394137 0.75407166 0.99348534]
|
|
|
|
mean value: 0.9496450414738218
|
|
|
|
key: test_roc_auc
|
|
value: [0.82608696 0.86945865 0.88395141 0.93393009 0.89705882 0.93382353
|
|
0.85294118 0.96323529 0.79411765 0.88970588]
|
|
|
|
mean value: 0.8844309462915602
|
|
|
|
key: train_roc_auc
|
|
value: [0.80913938 0.92904682 0.90947761 0.95031112 0.87785016 0.95439739
|
|
0.9014658 0.94788274 0.87214984 0.88517915]
|
|
|
|
mean value: 0.9036900011158876
|
|
|
|
key: test_jcc
|
|
value: [0.73913043 0.78823529 0.77142857 0.88311688 0.82926829 0.87837838
|
|
0.76744186 0.92857143 0.61643836 0.81927711]
|
|
|
|
mean value: 0.8021286608141679
|
|
|
|
key: train_jcc
|
|
value: [0.72307692 0.87463977 0.8224 0.90799397 0.80289093 0.91502276
|
|
0.83447332 0.90332326 0.74677419 0.81225033]
|
|
|
|
mean value: 0.8342845467581183
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05821252 0.07661438 0.08219218 0.11488438 0.11355114 0.05209708
|
|
0.07517815 0.06350946 0.06845737 0.0433135 ]
|
|
|
|
mean value: 0.07480101585388184
|
|
|
|
key: score_time
|
|
value: [0.02136469 0.02088022 0.02089167 0.04012966 0.02557182 0.02096868
|
|
0.0201993 0.01298022 0.01303434 0.02602172]
|
|
|
|
mean value: 0.02220423221588135
|
|
|
|
key: test_mcc
|
|
value: [0.90025835 0.75258453 0.89869927 0.9001543 0.71492035 0.76894131
|
|
0.79549513 0.84567499 0.78144702 0.76249285]
|
|
|
|
mean value: 0.812066810336383
|
|
|
|
key: train_mcc
|
|
value: [0.8810362 0.8013501 0.9022761 0.9186774 0.71018517 0.70887969
|
|
0.8969845 0.77059101 0.89396869 0.72834633]
|
|
|
|
mean value: 0.8212295193778039
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.86131387 0.94890511 0.94890511 0.83823529 0.875
|
|
0.89705882 0.91911765 0.88235294 0.86764706]
|
|
|
|
mean value: 0.8987440961786174
|
|
|
|
key: train_accuracy
|
|
value: [0.93887531 0.89242054 0.95110024 0.9592502 0.83631922 0.83550489
|
|
0.94788274 0.8737785 0.94543974 0.8485342 ]
|
|
|
|
mean value: 0.9029105575156163
|
|
|
|
key: test_fscore
|
|
value: [0.95035461 0.87741935 0.94814815 0.95104895 0.86075949 0.88741722
|
|
0.89393939 0.92413793 0.89333333 0.88311688]
|
|
|
|
mean value: 0.9069675317602913
|
|
|
|
key: train_fscore
|
|
value: [0.94145199 0.90236686 0.95073892 0.95961228 0.85894737 0.85834502
|
|
0.94648829 0.88743646 0.94761532 0.86770982]
|
|
|
|
mean value: 0.9120712328049019
|
|
|
|
key: test_precision
|
|
value: [0.91780822 0.7816092 0.96969697 0.91891892 0.75555556 0.80722892
|
|
0.921875 0.87012987 0.81707317 0.79069767]
|
|
|
|
mean value: 0.8550593489694658
|
|
|
|
key: train_precision
|
|
value: [0.90404798 0.82655827 0.95702479 0.9504 0.75462392 0.75369458
|
|
0.97250859 0.80078637 0.9112782 0.77020202]
|
|
|
|
mean value: 0.8601124713698691
|
|
|
|
key: test_recall
|
|
value: [0.98529412 1. 0.92753623 0.98550725 1. 0.98529412
|
|
0.86764706 0.98529412 0.98529412 1. ]
|
|
|
|
mean value: 0.9721867007672634
|
|
|
|
key: train_recall
|
|
value: [0.98208469 0.99348534 0.94453507 0.96900489 0.99674267 0.99674267
|
|
0.9218241 0.99511401 0.98697068 0.99348534]
|
|
|
|
mean value: 0.9779989478774224
|
|
|
|
key: test_roc_auc
|
|
value: [0.9491688 0.86231884 0.94906223 0.94863598 0.83823529 0.875
|
|
0.89705882 0.91911765 0.88235294 0.86764706]
|
|
|
|
mean value: 0.8988597612958227
|
|
|
|
key: train_roc_auc
|
|
value: [0.93884006 0.8923381 0.9510949 0.95925815 0.83631922 0.83550489
|
|
0.94788274 0.8737785 0.94543974 0.8485342 ]
|
|
|
|
mean value: 0.9028990493700548
|
|
|
|
key: test_jcc
|
|
value: [0.90540541 0.7816092 0.90140845 0.90666667 0.75555556 0.79761905
|
|
0.80821918 0.85897436 0.80722892 0.79069767]
|
|
|
|
mean value: 0.8313384448491006
|
|
|
|
key: train_jcc
|
|
value: [0.88938053 0.82210243 0.90610329 0.92236025 0.75276753 0.75184275
|
|
0.8984127 0.79765013 0.90044577 0.76633166]
|
|
|
|
mean value: 0.8407397023682444
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.67593026 0.74290133 0.6617012 0.73324609 0.68701911 0.72807765
|
|
0.84482217 0.75904655 0.80371714 0.76402116]
|
|
|
|
mean value: 0.7400482654571533
|
|
|
|
key: score_time
|
|
value: [0.0317471 0.0234251 0.02291584 0.02281976 0.02341962 0.02329421
|
|
0.03307962 0.0250566 0.02536154 0.02511978]
|
|
|
|
mean value: 0.02562391757965088
|
|
|
|
key: test_mcc
|
|
value: [0.92787101 0.89791134 0.89863497 0.84156943 0.92657079 0.88273483
|
|
0.82352941 0.92657079 0.89949371 0.91533482]
|
|
|
|
mean value: 0.8940221094416068
|
|
|
|
key: train_mcc
|
|
value: [0.95764031 0.94948177 0.95764076 0.95602223 0.96091715 0.95444297
|
|
0.96255221 0.95114007 0.96254199 0.9527801 ]
|
|
|
|
mean value: 0.9565159566173079
|
|
|
|
key: test_accuracy
|
|
value: [0.96350365 0.94890511 0.94890511 0.91970803 0.96323529 0.94117647
|
|
0.91176471 0.96323529 0.94852941 0.95588235]
|
|
|
|
mean value: 0.9464845427221984
|
|
|
|
key: train_accuracy
|
|
value: [0.97881011 0.97473513 0.97881011 0.97799511 0.98045603 0.9771987
|
|
0.98127036 0.97557003 0.98127036 0.97638436]
|
|
|
|
mean value: 0.9782500285381309
|
|
|
|
key: test_fscore
|
|
value: [0.96240602 0.94890511 0.95035461 0.92307692 0.96350365 0.94202899
|
|
0.91176471 0.96350365 0.95035461 0.95774648]
|
|
|
|
mean value: 0.9473644736994636
|
|
|
|
key: train_fscore
|
|
value: [0.9788961 0.97469388 0.97886179 0.97806661 0.98042414 0.97730956
|
|
0.981316 0.97557003 0.9812856 0.97644192]
|
|
|
|
mean value: 0.9782865639540559
|
|
|
|
key: test_precision
|
|
value: [0.98461538 0.94202899 0.93055556 0.89189189 0.95652174 0.92857143
|
|
0.91176471 0.95652174 0.91780822 0.91891892]
|
|
|
|
mean value: 0.933919856838173
|
|
|
|
key: train_precision
|
|
value: [0.97572816 0.97708674 0.97568882 0.97411003 0.98202614 0.97258065
|
|
0.97893031 0.97557003 0.9804878 0.97406807]
|
|
|
|
mean value: 0.9766276753260145
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.95588235 0.97101449 0.95652174 0.97058824 0.95588235
|
|
0.91176471 0.97058824 0.98529412 1. ]
|
|
|
|
mean value: 0.9618712702472293
|
|
|
|
key: train_recall
|
|
value: [0.98208469 0.9723127 0.98205546 0.98205546 0.97882736 0.98208469
|
|
0.98371336 0.97557003 0.98208469 0.97882736]
|
|
|
|
mean value: 0.9799615815846666
|
|
|
|
key: test_roc_auc
|
|
value: [0.96334186 0.94895567 0.94874254 0.91943734 0.96323529 0.94117647
|
|
0.91176471 0.96323529 0.94852941 0.95588235]
|
|
|
|
mean value: 0.9464300937766411
|
|
|
|
key: train_roc_auc
|
|
value: [0.97880743 0.9747371 0.97881275 0.97799842 0.98045603 0.9771987
|
|
0.98127036 0.97557003 0.98127036 0.97638436]
|
|
|
|
mean value: 0.9782505539584783
|
|
|
|
key: test_jcc
|
|
value: [0.92753623 0.90277778 0.90540541 0.85714286 0.92957746 0.89041096
|
|
0.83783784 0.92957746 0.90540541 0.91891892]
|
|
|
|
mean value: 0.9004590322853835
|
|
|
|
key: train_jcc
|
|
value: [0.95866455 0.95063694 0.95859873 0.95707472 0.9616 0.95562599
|
|
0.96331738 0.95230525 0.96325879 0.95396825]
|
|
|
|
mean value: 0.9575050598665193
|
|
|
|
MCC on Blind test: 0.66
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 BaggingClassifier(n_jobs=10, oob_score=True,
                                   random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.35942435 0.37208867 0.38434052 0.42646241 0.40576959 0.39391208
|
|
0.37551808 0.40830898 0.34282279 0.383919 ]
|
|
|
|
mean value: 0.38525664806365967
|
|
|
|
key: score_time
|
|
value: [0.02972317 0.03043103 0.03320718 0.03213596 0.03038836 0.02972698
|
|
0.03265667 0.03124976 0.03130412 0.03155637]
|
|
|
|
mean value: 0.03123795986175537
|
|
|
|
key: test_mcc
|
|
value: [0.8978896 0.89791134 0.92791659 0.88360693 0.89715584 0.89715584
|
|
0.83905224 0.8979331 0.88388348 0.91215932]
|
|
|
|
mean value: 0.8934664277275621
|
|
|
|
key: train_mcc
|
|
value: [0.99188303 0.99837134 0.9886543 0.9967453 0.99186852 0.98860066
|
|
0.98860066 0.99185799 0.99349061 0.99185799]
|
|
|
|
mean value: 0.9921930405041824
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.94890511 0.96350365 0.94160584 0.94852941 0.94852941
|
|
0.91911765 0.94852941 0.94117647 0.95588235]
|
|
|
|
mean value: 0.9464684413911549
|
|
|
|
key: train_accuracy
|
|
value: [0.99592502 0.999185 0.99429503 0.99837001 0.99592834 0.99429967
|
|
0.99429967 0.99592834 0.99674267 0.99592834]
|
|
|
|
mean value: 0.9960902096955313
|
|
|
|
key: test_fscore
|
|
value: [0.94814815 0.94890511 0.96296296 0.94117647 0.94890511 0.94890511
|
|
0.92086331 0.94964029 0.94285714 0.95652174]
|
|
|
|
mean value: 0.946888538927638
|
|
|
|
key: train_fscore
|
|
value: [0.99591169 0.999185 0.99425759 0.99836601 0.99591837 0.99429503
|
|
0.99429503 0.99593165 0.99673736 0.99593165]
|
|
|
|
mean value: 0.9960829383048007
|
|
|
|
key: test_precision
|
|
value: [0.95522388 0.94202899 0.98484848 0.95522388 0.94202899 0.94202899
|
|
0.90140845 0.92957746 0.91666667 0.94285714]
|
|
|
|
mean value: 0.941189292758102
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 0.99836334 0.99510604
|
|
0.99510604 0.99512195 0.99836601 0.99512195]
|
|
|
|
mean value: 0.9977185326077931
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.95588235 0.94202899 0.92753623 0.95588235 0.95588235
|
|
0.94117647 0.97058824 0.97058824 0.97058824]
|
|
|
|
mean value: 0.9531329923273657
|
|
|
|
key: train_recall
|
|
value: [0.99185668 0.99837134 0.98858075 0.99673736 0.99348534 0.99348534
|
|
0.99348534 0.99674267 0.99511401 0.99674267]
|
|
|
|
mean value: 0.994460149528936
|
|
|
|
key: test_roc_auc
|
|
value: [0.9488491 0.94895567 0.96366155 0.94170929 0.94852941 0.94852941
|
|
0.91911765 0.94852941 0.94117647 0.95588235]
|
|
|
|
mean value: 0.946494032395567
|
|
|
|
key: train_roc_auc
|
|
value: [0.99592834 0.99918567 0.99429038 0.99836868 0.99592834 0.99429967
|
|
0.99429967 0.99592834 0.99674267 0.99592834]
|
|
|
|
mean value: 0.9960900096178882
|
|
|
|
key: test_jcc
|
|
value: [0.90140845 0.90277778 0.92857143 0.88888889 0.90277778 0.90277778
|
|
0.85333333 0.90410959 0.89189189 0.91666667]
|
|
|
|
mean value: 0.8993203582430864
|
|
|
|
key: train_jcc
|
|
value: [0.99185668 0.99837134 0.98858075 0.99673736 0.99186992 0.98865478
|
|
0.98865478 0.99189627 0.99349593 0.99189627]
|
|
|
|
mean value: 0.9922014081324269
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.45340419 1.60676646 1.59989524 1.58247232 1.4779551 1.49836159
|
|
1.49765706 1.47903895 1.51418066 1.45867777]
|
|
|
|
mean value: 1.516840934753418
|
|
|
|
key: score_time
|
|
value: [0.08413672 0.05650353 0.07431793 0.08095384 0.07365823 0.07279754
|
|
0.07393479 0.08042669 0.06745076 0.0800302 ]
|
|
|
|
mean value: 0.07442102432250977
|
|
|
|
key: test_mcc
|
|
value: [0.73747083 0.77188355 0.92709446 0.87086187 0.8979331 0.82675403
|
|
0.7540057 0.79411765 0.73656956 0.92737353]
|
|
|
|
mean value: 0.8244064275741751
|
|
|
|
key: train_mcc
|
|
value: [0.95605202 0.96748206 0.9593539 0.95780409 0.95608819 0.95631148
|
|
0.96103952 0.96435357 0.96260328 0.9528711 ]
|
|
|
|
mean value: 0.9593959205340611
|
|
|
|
key: test_accuracy
|
|
value: [0.86861314 0.88321168 0.96350365 0.93430657 0.94852941 0.91176471
|
|
0.875 0.89705882 0.86764706 0.96323529]
|
|
|
|
mean value: 0.9112870330613998
|
|
|
|
key: train_accuracy
|
|
value: [0.97799511 0.98370008 0.9796251 0.97881011 0.97801303 0.97801303
|
|
0.98045603 0.98208469 0.98127036 0.97638436]
|
|
|
|
mean value: 0.9796351897719339
|
|
|
|
key: test_fscore
|
|
value: [0.86567164 0.88888889 0.96402878 0.93706294 0.94964029 0.91549296
|
|
0.88111888 0.89705882 0.87142857 0.96402878]
|
|
|
|
mean value: 0.9134420543292832
|
|
|
|
key: train_fscore
|
|
value: [0.97813765 0.98381877 0.97975709 0.97899838 0.97813765 0.97827836
|
|
0.98061389 0.98225806 0.98137652 0.97655618]
|
|
|
|
mean value: 0.9797932562619014
|
|
|
|
key: test_precision
|
|
value: [0.87878788 0.84210526 0.95714286 0.90540541 0.92957746 0.87837838
|
|
0.84 0.89705882 0.84722222 0.94366197]
|
|
|
|
mean value: 0.8919340265243767
|
|
|
|
key: train_precision
|
|
value: [0.9726248 0.97749196 0.97266881 0.9696 0.9726248 0.96661367
|
|
0.97275641 0.97284345 0.97584541 0.96950241]
|
|
|
|
mean value: 0.9722571720692034
|
|
|
|
key: test_recall
|
|
value: [0.85294118 0.94117647 0.97101449 0.97101449 0.97058824 0.95588235
|
|
0.92647059 0.89705882 0.89705882 0.98529412]
|
|
|
|
mean value: 0.936849957374254
|
|
|
|
key: train_recall
|
|
value: [0.98371336 0.99022801 0.98694943 0.98858075 0.98371336 0.99022801
|
|
0.98859935 0.99185668 0.98697068 0.98371336]
|
|
|
|
mean value: 0.9874552980748282
|
|
|
|
key: test_roc_auc
|
|
value: [0.86849957 0.88363171 0.96344842 0.93403666 0.94852941 0.91176471
|
|
0.875 0.89705882 0.86764706 0.96323529]
|
|
|
|
mean value: 0.9112851662404092
|
|
|
|
key: train_roc_auc
|
|
value: [0.97799045 0.98369476 0.97963107 0.97881806 0.97801303 0.97801303
|
|
0.98045603 0.98208469 0.98127036 0.97638436]
|
|
|
|
mean value: 0.9796355829981243
|
|
|
|
key: test_jcc
|
|
value: [0.76315789 0.8 0.93055556 0.88157895 0.90410959 0.84415584
|
|
0.7875 0.81333333 0.7721519 0.93055556]
|
|
|
|
mean value: 0.8427098618480825
|
|
|
|
key: train_jcc
|
|
value: [0.95721078 0.96815287 0.96031746 0.95886076 0.95721078 0.95748031
|
|
0.96196513 0.96513471 0.96343402 0.95418641]
|
|
|
|
mean value: 0.9603953231785132
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [3.24940395 3.52079797 3.22503304 3.50924063 3.05567026 2.2908988
|
|
2.29794836 2.28992009 2.27863598 2.27230477]
|
|
|
|
mean value: 2.7989853858947753
|
|
|
|
key: score_time
|
|
value: [0.01407909 0.01406145 0.01401496 0.01402545 0.00979352 0.00969934
|
|
0.00976634 0.0096848 0.00975347 0.009619 ]
|
|
|
|
mean value: 0.011449742317199706
|
|
|
|
key: test_mcc
|
|
value: [0.95629932 0.8687127 0.8978896 0.89863497 0.91176471 0.88273483
|
|
0.83832595 0.91215932 0.90184995 0.94280904]
|
|
|
|
mean value: 0.901118038297138
|
|
|
|
key: train_mcc
|
|
value: [0.99022004 0.99185136 0.98859135 0.99022004 0.99023327 0.99022801
|
|
0.99348534 0.99185799 0.99348534 0.99022801]
|
|
|
|
mean value: 0.9910400765606606
|
|
|
|
key: test_accuracy
|
|
value: [0.97810219 0.93430657 0.94890511 0.94890511 0.95588235 0.94117647
|
|
0.91911765 0.95588235 0.94852941 0.97058824]
|
|
|
|
mean value: 0.9501395448690425
|
|
|
|
key: train_accuracy
|
|
value: [0.99511002 0.99592502 0.99429503 0.99511002 0.99511401 0.99511401
|
|
0.99674267 0.99592834 0.99674267 0.99511401]
|
|
|
|
mean value: 0.9955195798125244
|
|
|
|
key: test_fscore
|
|
value: [0.97777778 0.93430657 0.94964029 0.95035461 0.95588235 0.94029851
|
|
0.91970803 0.95522388 0.95104895 0.97142857]
|
|
|
|
mean value: 0.9505669537495187
|
|
|
|
key: train_fscore
|
|
value: [0.99511401 0.99592502 0.99428571 0.99510604 0.99510604 0.99511401
|
|
0.99674267 0.99593165 0.99674267 0.99511401]
|
|
|
|
mean value: 0.9955181819751661
|
|
|
|
key: test_precision
|
|
value: [0.98507463 0.92753623 0.94285714 0.93055556 0.95588235 0.95454545
|
|
0.91304348 0.96969697 0.90666667 0.94444444]
|
|
|
|
mean value: 0.9430302923718009
|
|
|
|
key: train_precision
|
|
value: [0.99511401 0.99673736 0.99509804 0.99510604 0.99673203 0.99511401
|
|
0.99674267 0.99512195 0.99674267 0.99511401]
|
|
|
|
mean value: 0.9957622771290957
|
|
|
|
key: test_recall
|
|
value: [0.97058824 0.94117647 0.95652174 0.97101449 0.95588235 0.92647059
|
|
0.92647059 0.94117647 1. 1. ]
|
|
|
|
mean value: 0.9589300937766411
|
|
|
|
key: train_recall
|
|
value: [0.99511401 0.99511401 0.99347471 0.99510604 0.99348534 0.99511401
|
|
0.99674267 0.99674267 0.99674267 0.99511401]
|
|
|
|
mean value: 0.9952750131515322
|
|
|
|
key: test_roc_auc
|
|
value: [0.97804774 0.93435635 0.9488491 0.94874254 0.95588235 0.94117647
|
|
0.91911765 0.95588235 0.94852941 0.97058824]
|
|
|
|
mean value: 0.9501172208013641
|
|
|
|
key: train_roc_auc
|
|
value: [0.99511002 0.99592568 0.99429436 0.99511002 0.99511401 0.99511401
|
|
0.99674267 0.99592834 0.99674267 0.99511401]
|
|
|
|
mean value: 0.9955195785133188
|
|
|
|
key: test_jcc
|
|
value: [0.95652174 0.87671233 0.90410959 0.90540541 0.91549296 0.88732394
|
|
0.85135135 0.91428571 0.90666667 0.94444444]
|
|
|
|
mean value: 0.9062314140500687
|
|
|
|
key: train_jcc
|
|
value: [0.99027553 0.99188312 0.98863636 0.99025974 0.99025974 0.99027553
|
|
0.99350649 0.99189627 0.99350649 0.99027553]
|
|
|
|
mean value: 0.9910774800564104
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.06899261 0.05181146 0.05468035 0.05141401 0.04903173 0.04878616
|
|
0.06030512 0.05749083 0.11495543 0.06219625]
|
|
|
|
mean value: 0.06196639537811279
|
|
|
|
key: score_time
|
|
value: [0.01346803 0.01373625 0.01389027 0.01407194 0.01384234 0.01389575
|
|
0.01589203 0.01564479 0.01802635 0.01396275]
|
|
|
|
mean value: 0.014643049240112305
|
|
|
|
key: test_mcc
|
|
value: [0.74077551 0.69429215 0.84660737 0.70450233 0.6799747 0.66012934
|
|
0.6 0.70321085 0.61134064 0.77459667]
|
|
|
|
mean value: 0.7015429553097181
|
|
|
|
key: train_mcc
|
|
value: [0.72381466 0.73489837 0.74503794 0.70437109 0.71316163 0.6643151
|
|
0.70797069 0.6911857 0.64653991 0.69633693]
|
|
|
|
mean value: 0.7027632013908158
|
|
|
|
key: test_accuracy
|
|
value: [0.8540146 0.82481752 0.91970803 0.83211679 0.81617647 0.80882353
|
|
0.76470588 0.83088235 0.77205882 0.875 ]
|
|
|
|
mean value: 0.8298303993130098
|
|
|
|
key: train_accuracy
|
|
value: [0.84433578 0.85167074 0.85737571 0.83211084 0.83713355 0.80618893
|
|
0.83387622 0.8232899 0.79478827 0.82654723]
|
|
|
|
mean value: 0.8307317176769166
|
|
|
|
key: test_fscore
|
|
value: [0.87179487 0.85 0.92517007 0.85714286 0.8447205 0.8375
|
|
0.80952381 0.85534591 0.81437126 0.88888889]
|
|
|
|
mean value: 0.8554458161706764
|
|
|
|
key: train_fscore
|
|
value: [0.86520819 0.87055477 0.87491065 0.85594406 0.85994398 0.83765348
|
|
0.8575419 0.84982699 0.82972973 0.85218598]
|
|
|
|
mean value: 0.8553499715201868
|
|
|
|
key: test_precision
|
|
value: [0.77272727 0.73913043 0.87179487 0.75 0.7311828 0.72826087
|
|
0.68 0.74725275 0.68686869 0.8 ]
|
|
|
|
mean value: 0.750721767869033
|
|
|
|
key: train_precision
|
|
value: [0.7633873 0.77272727 0.77862595 0.74908201 0.75429975 0.72065728
|
|
0.75061125 0.73886883 0.70900693 0.74244256]
|
|
|
|
mean value: 0.7479709134762966
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.98550725 1. 1. 0.98529412
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.997080136402387
|
|
|
|
key: train_recall
|
|
value: [0.99837134 0.99674267 0.99836868 0.99836868 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9991851363774038
|
|
|
|
key: test_roc_auc
|
|
value: [0.85507246 0.82608696 0.91922421 0.83088235 0.81617647 0.80882353
|
|
0.76470588 0.83088235 0.77205882 0.875 ]
|
|
|
|
mean value: 0.8298913043478261
|
|
|
|
key: train_roc_auc
|
|
value: [0.84421014 0.85155241 0.85749053 0.83224623 0.83713355 0.80618893
|
|
0.83387622 0.8232899 0.79478827 0.82654723]
|
|
|
|
mean value: 0.8307323410790102
|
|
|
|
key: test_jcc
|
|
value: [0.77272727 0.73913043 0.86075949 0.75 0.7311828 0.72043011
|
|
0.68 0.74725275 0.68686869 0.8 ]
|
|
|
|
mean value: 0.7488351538528009
|
|
|
|
key: train_jcc
|
|
value: [0.76243781 0.77078086 0.77763659 0.74816626 0.75429975 0.72065728
|
|
0.75061125 0.73886883 0.70900693 0.74244256]
|
|
|
|
mean value: 0.7474908124059836
|
|
|
|
MCC on Blind test: 0.02
|
|
|
|
Accuracy on Blind test: 0.65
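The QDA blind-test figures above (MCC 0.02, accuracy 0.65) show how accuracy can look acceptable while MCC stays near zero. The following is a minimal sketch of that effect using only the standard definitions of the two metrics. The confusion-matrix counts are hypothetical and chosen purely for illustration; they are not taken from this run, and the helper names are not from the original scripts.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced blind set: 20 positives, 80 negatives.
counts = dict(tp=5, fp=20, fn=15, tn=60)

blind_accuracy = accuracy(**counts)
blind_mcc = mcc(**counts)

print(f"accuracy = {blind_accuracy:.2f}")
print(f"mcc      = {blind_mcc:.2f}")
```

With these assumed counts, tp * tn equals fp * fn, so the predictions carry no correlation with the labels even though 65 of 100 calls are correct. That is one reason to rank models on the blind-test MCC rather than on accuracy alone.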
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04560471 0.05994201 0.07156658 0.0524919 0.05347729 0.05321288
|
|
0.05366182 0.04788947 0.05366254 0.05350876]
|
|
|
|
mean value: 0.054501795768737794
|
|
|
|
key: score_time
|
|
value: [0.0202415 0.02025533 0.01818871 0.02007699 0.02113891 0.02003241
|
|
0.02011871 0.02007985 0.01999617 0.02001643]
|
|
|
|
mean value: 0.020014500617980956
|
|
|
|
key: test_mcc
|
|
value: [0.88355744 0.79705571 0.8978896 0.87308606 0.91334626 0.85442069
|
|
0.82928843 0.88273483 0.82928843 0.90184995]
|
|
|
|
mean value: 0.8662517400011381
|
|
|
|
key: train_mcc
|
|
value: [0.90300839 0.91267625 0.89468261 0.90436304 0.90139896 0.90309017
|
|
0.90601824 0.90152353 0.90623002 0.9029702 ]
|
|
|
|
mean value: 0.9035961418167106
|
|
|
|
key: test_accuracy
|
|
value: [0.94160584 0.89781022 0.94890511 0.93430657 0.95588235 0.92647059
|
|
0.91176471 0.94117647 0.91176471 0.94852941]
|
|
|
|
mean value: 0.9318215972520395
|
|
|
|
key: train_accuracy
|
|
value: [0.95110024 0.95599022 0.94702526 0.95191524 0.95032573 0.95114007
|
|
0.95276873 0.95032573 0.95276873 0.95114007]
|
|
|
|
mean value: 0.9514500025219743
|
|
|
|
key: test_fscore
|
|
value: [0.94029851 0.9 0.94964029 0.93793103 0.95714286 0.92857143
|
|
0.91666667 0.94029851 0.91666667 0.95104895]
|
|
|
|
mean value: 0.9338264907274486
|
|
|
|
key: train_fscore
|
|
value: [0.95215311 0.95686901 0.94795837 0.95268645 0.95131684 0.95215311
|
|
0.95352564 0.95139442 0.95367412 0.95207668]
|
|
|
|
mean value: 0.9523807745491089
|
|
|
|
key: test_precision
|
|
value: [0.95454545 0.875 0.94285714 0.89473684 0.93055556 0.90277778
|
|
0.86842105 0.95454545 0.86842105 0.90666667]
|
|
|
|
mean value: 0.9098526999316473
|
|
|
|
key: train_precision
|
|
value: [0.9328125 0.93887147 0.93081761 0.93690852 0.93270736 0.9328125
|
|
0.9384858 0.93135725 0.93573668 0.93416928]
|
|
|
|
mean value: 0.9344678970829278
|
|
|
|
key: test_recall
|
|
value: [0.92647059 0.92647059 0.95652174 0.98550725 0.98529412 0.95588235
|
|
0.97058824 0.92647059 0.97058824 1. ]
|
|
|
|
mean value: 0.9603793691389599
|
|
|
|
key: train_recall
|
|
value: [0.9723127 0.97557003 0.96574225 0.96900489 0.97068404 0.9723127
|
|
0.96905537 0.9723127 0.9723127 0.97068404]
|
|
|
|
mean value: 0.9709991444861868
|
|
|
|
key: test_roc_auc
|
|
value: [0.94149616 0.8980179 0.9488491 0.93393009 0.95588235 0.92647059
|
|
0.91176471 0.94117647 0.91176471 0.94852941]
|
|
|
|
mean value: 0.9317881500426257
|
|
|
|
key: train_roc_auc
|
|
value: [0.95108294 0.95597425 0.94704051 0.95192916 0.95032573 0.95114007
|
|
0.95276873 0.95032573 0.95276873 0.95114007]
|
|
|
|
mean value: 0.9514495911069073
|
|
|
|
key: test_jcc
|
|
value: [0.88732394 0.81818182 0.90410959 0.88311688 0.91780822 0.86666667
|
|
0.84615385 0.88732394 0.84615385 0.90666667]
|
|
|
|
mean value: 0.8763505422482849
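The `test_jcc` values above appear to be Jaccard scores for the positive class (an assumption based on the key name). For a binary problem, the Jaccard score J and the F-score F computed on the same predictions satisfy J = F / (2 - F). The sketch below checks that identity against the first three Ridge Classifier folds copied from this log; the function name is illustrative only.

```python
def jaccard_from_f1(f1):
    """Convert a binary F1 score to the equivalent Jaccard index."""
    return f1 / (2.0 - f1)


# First three folds of test_fscore and test_jcc for Ridge Classifier,
# as printed in the log above (rounded to 8 decimals by NumPy).
test_fscore = [0.94029851, 0.9, 0.94964029]
test_jcc = [0.88732394, 0.81818182, 0.90410959]

derived = [jaccard_from_f1(f) for f in test_fscore]
max_abs_diff = max(abs(d - j) for d, j in zip(derived, test_jcc))

for f, d, j in zip(test_fscore, derived, test_jcc):
    print(f"F1={f:.8f}  derived J={d:.8f}  logged J={j:.8f}")
print(f"largest absolute difference: {max_abs_diff:.2e}")
```

Because one metric is a monotonic function of the other, the two columns rank the models identically, so `test_jcc` adds no ordering information beyond `test_fscore`.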
|
|
|
|
key: train_jcc
|
|
value: [0.9086758 0.91730475 0.90106545 0.90964778 0.90715373 0.9086758
|
|
0.91117917 0.90729483 0.91145038 0.90853659]
|
|
|
|
mean value: 0.9090984275974558
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.68297338 0.58741832 0.45031023 0.64178634 0.64991045 0.64547777
|
|
0.64209366 0.61012626 0.55511141 0.53283501]
|
|
|
|
mean value: 0.5998042821884155
|
|
|
|
key: score_time
|
|
value: [0.02061009 0.02557373 0.02007389 0.02004433 0.02005386 0.01999784
|
|
0.02003598 0.01997614 0.02412701 0.01992822]
|
|
|
|
mean value: 0.0210421085357666
|
|
|
|
key: test_mcc
|
|
value: [0.8978896 0.87099729 0.91277477 0.9001543 0.8979331 0.85331034
|
|
0.78632938 0.91215932 0.88580789 0.89949371]
|
|
|
|
mean value: 0.8816849688659582
|
|
|
|
key: train_mcc
|
|
value: [0.91404667 0.92692597 0.91396326 0.91229254 0.91728977 0.91728977
|
|
0.92206383 0.91085966 0.91747489 0.90910351]
|
|
|
|
mean value: 0.91613098611342
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.93430657 0.95620438 0.94890511 0.94852941 0.92647059
|
|
0.88970588 0.95588235 0.94117647 0.94852941]
|
|
|
|
mean value: 0.9398615285530271
|
|
|
|
key: train_accuracy
|
|
value: [0.95680522 0.96332518 0.95680522 0.95599022 0.95846906 0.95846906
|
|
0.96091205 0.95521173 0.95846906 0.95439739]
|
|
|
|
mean value: 0.9578854174133038
|
|
|
|
key: test_fscore
|
|
value: [0.94814815 0.93617021 0.95714286 0.95104895 0.94964029 0.92753623
|
|
0.89655172 0.95522388 0.94366197 0.95035461]
|
|
|
|
mean value: 0.9415478875254766
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:136: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:139: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
key: train_fscore
|
|
value: [0.957498 0.96379726 0.95736122 0.95652174 0.95903614 0.95903614
|
|
0.96135266 0.95589415 0.95916733 0.95498392]
|
|
|
|
mean value: 0.9584648570657469
|
|
|
|
key: test_precision
|
|
value: [0.95522388 0.90410959 0.94366197 0.91891892 0.92957746 0.91428571
|
|
0.84415584 0.96969697 0.90540541 0.91780822]
|
|
|
|
mean value: 0.9202843977898764
|
|
|
|
key: train_precision
|
|
value: [0.94312796 0.95230525 0.94444444 0.94435612 0.94611727 0.94611727
|
|
0.95063694 0.94154818 0.94330709 0.94285714]
|
|
|
|
mean value: 0.945481767751615
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.97058824 0.97101449 0.98550725 0.97058824 0.94117647
|
|
0.95588235 0.94117647 0.98529412 0.98529412]
|
|
|
|
mean value: 0.964769820971867
|
|
|
|
key: train_recall
|
|
value: [0.9723127 0.97557003 0.97063622 0.96900489 0.9723127 0.9723127
|
|
0.9723127 0.97068404 0.97557003 0.96742671]
|
|
|
|
mean value: 0.9718142737963027
|
|
|
|
key: test_roc_auc
|
|
value: [0.9488491 0.93456948 0.95609548 0.94863598 0.94852941 0.92647059
|
|
0.88970588 0.95588235 0.94117647 0.94852941]
|
|
|
|
mean value: 0.9398444160272805
|
|
|
|
key: train_roc_auc
|
|
value: [0.95679257 0.9633152 0.95681648 0.95600082 0.95846906 0.95846906
|
|
0.96091205 0.95521173 0.95846906 0.95439739]
|
|
|
|
mean value: 0.9578853398940438
|
|
|
|
key: test_jcc
|
|
value: [0.90140845 0.88 0.91780822 0.90666667 0.90410959 0.86486486
|
|
0.8125 0.91428571 0.89333333 0.90540541]
|
|
|
|
mean value: 0.8900382243479388
|
|
|
|
key: train_jcc
|
|
value: [0.91846154 0.93012422 0.91820988 0.91666667 0.9212963 0.9212963
|
|
0.9255814 0.91551459 0.92153846 0.91384615]
|
|
|
|
mean value: 0.9202535501533893
|
|
|
|
MCC on Blind test: 0.65
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])

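The pipeline printed above wraps a `ColumnTransformer` (min-max scaling for the numerical columns, one-hot encoding for the categorical ones) around the classifier. A minimal sketch of how such a pipeline could be assembled is given here. The helper name `build_pipeline` and the three-column toy frame are hypothetical; only the step names (`prep`, `num`, `cat`, `model`) and the estimator classes are taken from the log.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder


def build_pipeline(numerical_cols, categorical_cols, model):
    """Hypothetical helper mirroring the logged Pipeline layout."""
    prep = ColumnTransformer(
        remainder="passthrough",
        transformers=[
            ("num", MinMaxScaler(), numerical_cols),    # scale to [0, 1]
            ("cat", OneHotEncoder(), categorical_cols),  # one column per level
        ],
    )
    return Pipeline(steps=[("prep", prep), ("model", model)])


# Toy stand-in data (assumption): two numerical and one categorical column,
# named after a few of the columns that appear in the log.
toy_X = pd.DataFrame({
    "ligand_distance": np.linspace(0.0, 30.0, 40),
    "ddg_foldx": np.sin(np.arange(40)),
    "ss_class": np.tile(["helix", "sheet", "loop"], 14)[:40],
})
toy_y = (toy_X["ligand_distance"] < 15).astype(int)

pipe = build_pipeline(
    ["ligand_distance", "ddg_foldx"],
    ["ss_class"],
    LogisticRegression(random_state=42),
)
pipe.fit(toy_X, toy_y)
toy_pred = pipe.predict(toy_X)
```

Keeping the scaler and encoder inside the pipeline means they are refitted on each training fold during cross-validation, so no information from the held-out fold leaks into preprocessing.
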
key: fit_time
value: [0.04414916 0.08380175 0.07032514 0.07745457 0.05293298 0.05717969
 0.05249119 0.05452228 0.08637619 0.11670065]

mean value: 0.06959335803985596

key: score_time
value: [0.01268363 0.01511669 0.01902628 0.01525545 0.01350904 0.01557541
 0.01358795 0.0154953 0.01878643 0.01540351]

mean value: 0.015443968772888183

key: test_mcc
value: [0.812277 0.75191816 0.88320546 0.83947987 0.82495791 0.82675403
 0.73817324 0.84051051 0.72443685 0.78017138]

mean value: 0.8021884424651033

key: train_mcc
value: [0.83048073 0.85004477 0.83701763 0.81907163 0.83553259 0.82747132
 0.84527799 0.82736156 0.85343831 0.84366611]

mean value: 0.8369362649023049

key: test_accuracy
value: [0.90510949 0.87591241 0.94160584 0.91970803 0.91176471 0.91176471
 0.86764706 0.91911765 0.86029412 0.88970588]

mean value: 0.9002629884070417

key: train_accuracy
value: [0.91524042 0.92502037 0.91850041 0.90953545 0.91775244 0.91368078
 0.92263844 0.91368078 0.9267101 0.9218241 ]

mean value: 0.9184583303467847

key: test_fscore
value: [0.90076336 0.87591241 0.94202899 0.92086331 0.90909091 0.91549296
 0.87323944 0.91603053 0.86713287 0.89208633]

mean value: 0.9012641098273885

key: train_fscore
value: [0.91530945 0.92520325 0.91816694 0.90938776 0.91808597 0.91297209
 0.92270138 0.91368078 0.92694805 0.92156863]

mean value: 0.9184024291795301

key: test_precision
value: [0.93650794 0.86956522 0.94202899 0.91428571 0.9375 0.87837838
 0.83783784 0.95238095 0.82666667 0.87323944]

mean value: 0.8968391125575755

key: train_precision
value: [0.91530945 0.9237013 0.92118227 0.91013072 0.91437803 0.9205298
 0.92195122 0.91368078 0.92394822 0.92459016]

mean value: 0.9189401945593438

key: test_recall
value: [0.86764706 0.88235294 0.94202899 0.92753623 0.88235294 0.95588235
 0.91176471 0.88235294 0.91176471 0.91176471]

mean value: 0.9075447570332481

key: train_recall
value: [0.91530945 0.9267101 0.91517129 0.908646 0.9218241 0.90553746
 0.92345277 0.91368078 0.92996743 0.91856678]

mean value: 0.9178866151941378

key: test_roc_auc
value: [0.90483802 0.87595908 0.94160273 0.91965047 0.91176471 0.91176471
 0.86764706 0.91911765 0.86029412 0.88970588]

mean value: 0.900234441602728

key: train_roc_auc
value: [0.91524037 0.925019 0.9184977 0.90953473 0.91775244 0.91368078
 0.92263844 0.91368078 0.9267101 0.9218241 ]

mean value: 0.9184578433612659

key: test_jcc
value: [0.81944444 0.77922078 0.89041096 0.85333333 0.83333333 0.84415584
 0.775 0.84507042 0.7654321 0.80519481]

mean value: 0.8210596019887293

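The `test_jcc` values can be cross-checked against `test_fscore`. Assuming `jcc` is the binary Jaccard index of the positive class and `fscore` is the binary F1 score (the log does not state this), the two are tied by J = F / (2 - F). The sketch below applies that identity to the per-fold values logged above; the variable names are illustrative only.

```python
import numpy as np

# Per-fold values copied from the Logistic Regression block of the log.
test_fscore = np.array([
    0.90076336, 0.87591241, 0.94202899, 0.92086331, 0.90909091,
    0.91549296, 0.87323944, 0.91603053, 0.86713287, 0.89208633,
])
test_jcc = np.array([
    0.81944444, 0.77922078, 0.89041096, 0.85333333, 0.83333333,
    0.84415584, 0.775, 0.84507042, 0.7654321, 0.80519481,
])

# F1 = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN), hence J = F / (2 - F).
jcc_from_f = test_fscore / (2.0 - test_fscore)

# Largest disagreement between the derived and the logged Jaccard values;
# only the rounding of the printed digits should remain.
max_abs_diff = np.max(np.abs(jcc_from_f - test_jcc))
print("max |J_from_F - J_logged|:", max_abs_diff)
```

Because the two metrics are monotonic functions of each other, they rank the folds identically. Reporting both adds no independent information about the classifier.
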
key: train_jcc
value: [0.84384384 0.86081694 0.84871407 0.83383234 0.84857571 0.83987915
 0.85649547 0.84107946 0.86384266 0.85454545]

mean value: 0.8491625104737037

MCC on Blind test: 0.61

Accuracy on Blind test: 0.89

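The `key` / `value` / `mean value` triplets above have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is given named scorers and `return_train_score=True`. The two closing lines look like a separate evaluation on held-out data. The sketch below reproduces that output layout on synthetic data. The scorer definitions, the fold strategy and the 80/20 split are assumptions rather than the script's actual settings. In particular, `make_scorer(roc_auc_score)` scores hard predictions, which is only one possible explanation for the logged `roc_auc` values tracking accuracy so closely.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             make_scorer, matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import (StratifiedKFold, cross_validate,
                                     train_test_split)

# Scorer names chosen so the result keys match the log (test_mcc, train_jcc, ...).
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": make_scorer(accuracy_score),
    "fscore": make_scorer(f1_score),
    "precision": make_scorer(precision_score),
    "recall": make_scorer(recall_score),
    "roc_auc": make_scorer(roc_auc_score),  # assumption: scored on hard labels
    "jcc": make_scorer(jaccard_score),
}

# Synthetic stand-in for the training set and the blind test set.
X, y = make_classification(n_samples=400, n_features=10, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = LogisticRegression(random_state=42)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(model, X_train, y_train, cv=cv,
                        scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))

# Refit on the full training set, then evaluate once on the blind set.
model.fit(X_train, y_train)
blind_pred = model.predict(X_blind)
print("MCC on Blind test:", round(matthews_corrcoef(y_blind, blind_pred), 2))
print("Accuracy on Blind test:", round(accuracy_score(y_blind, blind_pred), 2))
```
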
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
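The warning above names two remedies: scale the data or raise `max_iter`. The logged pipeline already applies `MinMaxScaler` to the numerical columns, so the iteration limit is the more likely lever here. `LogisticRegressionCV(random_state=42)` runs with the default `max_iter=100`. The sketch below, on synthetic data, counts `ConvergenceWarning`s at a small and a large iteration limit. The helper name and the specific limits are illustrative assumptions, not values from the original script.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption); the real features are not in the log.
X, y = make_classification(n_samples=300, n_features=20, random_state=42)


def count_convergence_warnings(max_iter):
    """Fit a scaled LogisticRegressionCV and count ConvergenceWarnings."""
    clf = make_pipeline(
        MinMaxScaler(),
        LogisticRegressionCV(max_iter=max_iter, random_state=42),
    )
    with warnings.catch_warnings(record=True) as caught:
        # Record every ConvergenceWarning instead of showing it once.
        warnings.simplefilter("always", category=ConvergenceWarning)
        clf.fit(X, y)
    return sum(issubclass(w.category, ConvergenceWarning) for w in caught)


n_low = count_convergence_warnings(max_iter=5)      # deliberately too few
n_high = count_convergence_warnings(max_iter=5000)  # generous limit
print("warnings with max_iter=5:", n_low)
print("warnings with max_iter=5000:", n_high)
```

`LogisticRegressionCV` fits one model per fold and per candidate `C`, which would explain why a single pipeline run can emit the same warning many times.
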
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.32345986 1.43449521 1.27052402 1.54930925 1.47475839 1.37269783
|
|
1.32320762 1.29081202 1.4030478 1.44133997]
|
|
|
|
mean value: 1.3883651971817017
|
|
|
|
key: score_time
|
|
value: [0.01582932 0.02128839 0.01553774 0.01557136 0.01531792 0.01843786
|
|
0.02614975 0.01983738 0.01549149 0.01612997]
|
|
|
|
mean value: 0.017959117889404297
|
|
|
|
key: test_mcc
|
|
value: [0.94160273 0.87631485 0.92787101 0.8502811 0.86849267 0.92898531
|
|
0.82928843 0.92737353 0.82675403 0.85628096]
|
|
|
|
mean value: 0.8833244627512812
|
|
|
|
key: train_mcc
|
|
value: [0.93540003 0.96791376 0.94666567 0.94883081 0.93851413 0.95851448
|
|
0.95806651 0.95199397 0.90417865 0.92389522]
|
|
|
|
mean value: 0.9433973227538303
|
|
|
|
key: test_accuracy
|
|
value: [0.97080292 0.93430657 0.96350365 0.91970803 0.93382353 0.96323529
|
|
0.91176471 0.96323529 0.91176471 0.92647059]
|
|
|
|
mean value: 0.9398615285530271
|
|
|
|
key: train_accuracy
|
|
value: [0.96740016 0.98370008 0.97310513 0.97392013 0.96905537 0.97882736
|
|
0.97882736 0.97557003 0.9519544 0.96172638]
|
|
|
|
mean value: 0.971408642142457
|
|
|
|
key: test_fscore
|
|
value: [0.97058824 0.93793103 0.96453901 0.9261745 0.9352518 0.96453901
|
|
0.91666667 0.96402878 0.91549296 0.92957746]
|
|
|
|
mean value: 0.9424789445347015
|
|
|
|
key: train_fscore
|
|
value: [0.968 0.98397436 0.97349398 0.97448166 0.96950241 0.97926635
|
|
0.97913323 0.97607656 0.95253419 0.96230954]
|
|
|
|
mean value: 0.9718772264685587
|
|
|
|
key: test_precision
|
|
value: [0.97058824 0.88311688 0.94444444 0.8625 0.91549296 0.93150685
|
|
0.86842105 0.94366197 0.87837838 0.89189189]
|
|
|
|
mean value: 0.9090002664649828
|
|
|
|
key: train_precision
|
|
value: [0.95125786 0.96845426 0.95886076 0.95319813 0.9556962 0.959375
|
|
0.96518987 0.95625 0.94117647 0.9478673 ]
|
|
|
|
mean value: 0.9557325852844888
|
|
|
|
key: test_recall
|
|
value: [0.97058824 1. 0.98550725 1. 0.95588235 1.
|
|
0.97058824 0.98529412 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9794330775788577
|
|
|
|
key: train_recall
|
|
value: [0.98534202 1. 0.98858075 0.99673736 0.98371336 1.
|
|
0.99348534 0.99674267 0.96416938 0.9771987 ]
|
|
|
|
mean value: 0.9885969573465256
|
|
|
|
key: test_roc_auc
|
|
value: [0.97080136 0.93478261 0.96334186 0.91911765 0.93382353 0.96323529
|
|
0.91176471 0.96323529 0.91176471 0.92647059]
|
|
|
|
mean value: 0.9398337595907928
|
|
|
|
key: train_roc_auc
|
|
value: [0.96738553 0.98368679 0.97311774 0.97393871 0.96905537 0.97882736
|
|
0.97882736 0.97557003 0.9519544 0.96172638]
|
|
|
|
mean value: 0.9714089674851614
|
|
|
|
key: test_jcc
|
|
value: [0.94285714 0.88311688 0.93150685 0.8625 0.87837838 0.93150685
|
|
0.84615385 0.93055556 0.84415584 0.86842105]
|
|
|
|
mean value: 0.8919152401479367
|
|
|
|
key: train_jcc
|
|
value: [0.9379845 0.96845426 0.94835681 0.95023328 0.94080997 0.959375
|
|
0.9591195 0.95327103 0.9093702 0.92735703]
|
|
|
|
mean value: 0.9454331569694207
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0196569 0.01368809 0.01368642 0.01357722 0.01296353 0.01416087
|
|
0.01404524 0.01523566 0.01527238 0.01575828]
|
|
|
|
mean value: 0.014804458618164063
|
|
|
|
key: score_time
|
|
value: [0.01393223 0.0100534 0.00978374 0.00978899 0.00955415 0.0099647
|
|
0.01073837 0.0102942 0.01072192 0.01070642]
|
|
|
|
mean value: 0.010553812980651856
|
|
|
|
key: test_mcc
|
|
value: [0.64295346 0.51887407 0.71597934 0.63063055 0.66356093 0.60352881
|
|
0.69305253 0.68120121 0.54559454 0.69731096]
|
|
|
|
mean value: 0.6392686404912149
|
|
|
|
key: train_mcc
|
|
value: [0.6456698 0.66922404 0.66332001 0.64476831 0.62687987 0.63075937
|
|
0.66776333 0.62667844 0.67204092 0.63837522]
|
|
|
|
mean value: 0.6485479305996751
|
|
|
|
key: test_accuracy
|
|
value: [0.81751825 0.75912409 0.8540146 0.81021898 0.83088235 0.80147059
|
|
0.84558824 0.83823529 0.77205882 0.84558824]
|
|
|
|
mean value: 0.8174699441820524
|
|
|
|
key: train_accuracy
|
|
value: [0.82233089 0.83374083 0.83129584 0.8190709 0.81188925 0.81433225
|
|
0.83306189 0.81188925 0.83550489 0.81840391]
|
|
|
|
mean value: 0.8231519901032417
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.76258993 0.84375 0.79365079 0.82442748 0.8057554
|
|
0.83969466 0.828125 0.76335878 0.83464567]
|
|
|
|
mean value: 0.8095997702713673
|
|
|
|
key: train_fscore
|
|
value: [0.81742044 0.8277027 0.82706767 0.80492091 0.80205656 0.80645161
|
|
0.82700422 0.80239521 0.83082077 0.81181435]
|
|
|
|
mean value: 0.8157654434944623
|
|
|
|
key: test_precision
|
|
value: [0.87719298 0.74647887 0.91525424 0.87719298 0.85714286 0.78873239
|
|
0.87301587 0.88333333 0.79365079 0.89830508]
|
|
|
|
mean value: 0.851029941169467
|
|
|
|
key: train_precision
|
|
value: [0.84137931 0.85964912 0.84760274 0.87238095 0.84629295 0.84219858
|
|
0.85814361 0.84504505 0.85517241 0.84238179]
|
|
|
|
mean value: 0.8510246507261562
|
|
|
|
key: test_recall
|
|
value: [0.73529412 0.77941176 0.7826087 0.72463768 0.79411765 0.82352941
|
|
0.80882353 0.77941176 0.73529412 0.77941176]
|
|
|
|
mean value: 0.7742540494458653
|
|
|
|
key: train_recall
|
|
value: [0.79478827 0.7980456 0.80750408 0.74714519 0.76221498 0.77361564
|
|
0.7980456 0.76384365 0.80781759 0.78338762]
|
|
|
|
mean value: 0.7836408223560106
|
|
|
|
key: test_roc_auc
|
|
value: [0.81692242 0.7592711 0.85453964 0.81084825 0.83088235 0.80147059
|
|
0.84558824 0.83823529 0.77205882 0.84558824]
|
|
|
|
mean value: 0.8175404944586531
|
|
|
|
key: train_roc_auc
|
|
value: [0.82235335 0.83376995 0.83127647 0.81901233 0.81188925 0.81433225
|
|
0.83306189 0.81188925 0.83550489 0.81840391]
|
|
|
|
mean value: 0.8231493535822648
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.61627907 0.72972973 0.65789474 0.7012987 0.6746988
|
|
0.72368421 0.70666667 0.61728395 0.71621622]
|
|
|
|
mean value: 0.681041874351185
|
|
|
|
key: train_jcc
|
|
value: [0.69121813 0.70605187 0.70512821 0.67352941 0.6695279 0.67567568
|
|
0.70503597 0.67 0.71060172 0.68323864]
|
|
|
|
mean value: 0.6890007519859123
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02289462 0.01865649 0.01860642 0.01872373 0.01842427 0.01855087
|
|
0.01844025 0.01858234 0.01843119 0.0383122 ]
|
|
|
|
mean value: 0.020962238311767578
|
|
|
|
key: score_time
|
|
value: [0.0129528 0.01290703 0.01294899 0.01289344 0.01297498 0.01293111
|
|
0.01290679 0.01291537 0.01290774 0.01293302]
|
|
|
|
mean value: 0.01292712688446045
|
|
|
|
key: test_mcc
|
|
value: [0.59324085 0.56235346 0.62041773 0.50373224 0.63242133 0.67911938
|
|
0.4738791 0.70710678 0.5008673 0.66240967]
|
|
|
|
mean value: 0.5935547845683921
|
|
|
|
key: train_mcc
|
|
value: [0.64145228 0.63325194 0.6040146 0.5949754 0.59609121 0.60099713
|
|
0.62396473 0.60912052 0.63687624 0.60749911]
|
|
|
|
mean value: 0.6148243154248934
|
|
|
|
key: test_accuracy
|
|
value: [0.79562044 0.7810219 0.81021898 0.75182482 0.81617647 0.83823529
|
|
0.73529412 0.85294118 0.75 0.83088235]
|
|
|
|
mean value: 0.7962215543151567
|
|
|
|
key: train_accuracy
|
|
value: [0.8207009 0.81662592 0.80195599 0.79706601 0.7980456 0.8004886
|
|
0.81188925 0.80456026 0.81840391 0.80374593]
|
|
|
|
mean value: 0.8073482368744508
|
|
|
|
key: test_fscore
|
|
value: [0.78461538 0.7826087 0.8115942 0.75714286 0.81751825 0.84507042
|
|
0.75 0.84848485 0.75714286 0.83453237]
|
|
|
|
mean value: 0.7988709890747785
|
|
|
|
key: train_fscore
|
|
value: [0.82200647 0.81692433 0.80355699 0.79128248 0.7980456 0.79967294
|
|
0.81415929 0.80456026 0.81972514 0.80422421]
|
|
|
|
mean value: 0.8074157715143396
|
|
|
|
key: test_precision
|
|
value: [0.82258065 0.77142857 0.8115942 0.74647887 0.8115942 0.81081081
|
|
0.71052632 0.875 0.73611111 0.81690141]
|
|
|
|
mean value: 0.79130261417885
|
|
|
|
key: train_precision
|
|
value: [0.81672026 0.81626016 0.79647436 0.8137931 0.7980456 0.80295567
|
|
0.80445151 0.80456026 0.81380417 0.80226904]
|
|
|
|
mean value: 0.8069334137924529
|
|
|
|
key: test_recall
|
|
value: [0.75 0.79411765 0.8115942 0.76811594 0.82352941 0.88235294
|
|
0.79411765 0.82352941 0.77941176 0.85294118]
|
|
|
|
mean value: 0.8079710144927537
|
|
|
|
key: train_recall
|
|
value: [0.82736156 0.81758958 0.81076672 0.76998369 0.7980456 0.79641694
|
|
0.82410423 0.80456026 0.8257329 0.80618893]
|
|
|
|
mean value: 0.8080750407830343
|
|
|
|
key: test_roc_auc
|
|
value: [0.79528986 0.78111679 0.81020887 0.75170503 0.81617647 0.83823529
|
|
0.73529412 0.85294118 0.75 0.83088235]
|
|
|
|
mean value: 0.7961849957374254
|
|
|
|
key: train_roc_auc
|
|
value: [0.82069546 0.81662513 0.80196317 0.79704396 0.7980456 0.8004886
|
|
0.81188925 0.80456026 0.81840391 0.80374593]
|
|
|
|
mean value: 0.8073461270730269
|
|
|
|
key: test_jcc
|
|
value: [0.64556962 0.64285714 0.68292683 0.6091954 0.69135802 0.73170732
|
|
0.6 0.73684211 0.6091954 0.71604938]
|
|
|
|
mean value: 0.6665701226720038
|
|
|
|
key: train_jcc
|
|
value: [0.6978022 0.69050894 0.67162162 0.65464632 0.66395664 0.66621253
|
|
0.68656716 0.67302452 0.69452055 0.67255435]
|
|
|
|
mean value: 0.6771414841563377
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01670122 0.01258683 0.01203728 0.01207185 0.0117681 0.01326847
|
|
0.01234245 0.01252675 0.01242781 0.01216888]
|
|
|
|
mean value: 0.01278996467590332
|
|
|
|
key: score_time
|
|
value: [0.04255915 0.02082086 0.02272153 0.02175403 0.02360296 0.02218795
|
|
0.02258205 0.02123189 0.02138758 0.02124214]
|
|
|
|
mean value: 0.024009013175964357
|
|
|
|
key: test_mcc
|
|
value: [0.81460896 0.73858362 0.80402464 0.67267776 0.72698376 0.77459667
|
|
0.76409318 0.79967098 0.75653442 0.8722811 ]
|
|
|
|
mean value: 0.7724055093528688
|
|
|
|
key: train_mcc
|
|
value: [0.85191645 0.85040708 0.85300846 0.85379503 0.85346882 0.87206933
|
|
0.85100719 0.85930172 0.86550007 0.8553372 ]
|
|
|
|
mean value: 0.8565811351622082
|
|
|
|
key: test_accuracy
|
|
value: [0.90510949 0.86861314 0.89781022 0.82481752 0.86029412 0.875
|
|
0.875 0.89705882 0.86764706 0.93382353]
|
|
|
|
mean value: 0.8805173894375269
|
|
|
|
key: train_accuracy
|
|
value: [0.92176039 0.92176039 0.92257539 0.92257539 0.92345277 0.93241042
|
|
0.92100977 0.92589577 0.92915309 0.92345277]
|
|
|
|
mean value: 0.9244046149476093
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.87142857 0.90540541 0.84615385 0.86896552 0.88888889
|
|
0.88590604 0.90277778 0.88157895 0.93706294]
|
|
|
|
mean value: 0.8897258840686593
|
|
|
|
key: train_fscore
|
|
value: [0.92694064 0.92649311 0.92742552 0.92764661 0.92791411 0.93649579
|
|
0.92634776 0.93048128 0.93343535 0.92846271]
|
|
|
|
mean value: 0.9291642877686668
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.84722222 0.84810127 0.75862069 0.81818182 0.8
|
|
0.81481481 0.85526316 0.79761905 0.89333333]
|
|
|
|
mean value: 0.8299823016210597
|
|
|
|
key: train_precision
|
|
value: [0.87 0.87427746 0.87212644 0.87 0.87681159 0.88311688
|
|
0.86770982 0.87625899 0.88023088 0.87142857]
|
|
|
|
mean value: 0.8741960630292233
|
|
|
|
key: test_recall
|
|
value: [0.95588235 0.89705882 0.97101449 0.95652174 0.92647059 1.
|
|
0.97058824 0.95588235 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9604006820119353
|
|
|
|
key: train_recall
|
|
value: [0.99185668 0.98534202 0.99021207 0.99347471 0.98534202 0.99674267
|
|
0.99348534 0.99185668 0.99348534 0.99348534]
|
|
|
|
mean value: 0.9915282877502112
|
|
|
|
key: test_roc_auc
|
|
value: [0.90547741 0.86881927 0.89727195 0.8238491 0.86029412 0.875
|
|
0.875 0.89705882 0.86764706 0.93382353]
|
|
|
|
mean value: 0.880424126172208
|
|
|
|
key: train_roc_auc
|
|
value: [0.92170322 0.92170853 0.92263047 0.92263312 0.92345277 0.93241042
|
|
0.92100977 0.92589577 0.92915309 0.92345277]
|
|
|
|
mean value: 0.9244049927998682
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.7721519 0.82716049 0.73333333 0.76829268 0.8
|
|
0.79518072 0.82278481 0.78823529 0.88157895]
|
|
|
|
mean value: 0.802205151665905
|
|
|
|
key: train_jcc
|
|
value: [0.86382979 0.86305278 0.86467236 0.86505682 0.86552217 0.88057554
|
|
0.86280057 0.87 0.87517934 0.86647727]
|
|
|
|
mean value: 0.8677166644458821
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07805991 0.07112241 0.07135248 0.0707829 0.07207608 0.07116532
|
|
0.07046342 0.07161427 0.06986785 0.07115483]
|
|
|
|
mean value: 0.07176594734191895
|
|
|
|
key: score_time
|
|
value: [0.02446485 0.02405882 0.02418971 0.02379894 0.02402472 0.02377605
|
|
0.02382326 0.02369189 0.02512693 0.02385998]
|
|
|
|
mean value: 0.02408151626586914
|
|
|
|
key: test_mcc
|
|
value: [0.67983923 0.75261265 0.87099729 0.82788248 0.79411765 0.82495791
|
|
0.72254413 0.79446135 0.69305253 0.83905224]
|
|
|
|
mean value: 0.7799517454933128
|
|
|
|
key: train_mcc
|
|
value: [0.81255164 0.83715827 0.83586511 0.84556016 0.82757672 0.84536769
|
|
0.83387622 0.83559466 0.84697743 0.82578657]
|
|
|
|
mean value: 0.8346314474250649
|
|
|
|
key: test_accuracy
|
|
value: [0.83941606 0.87591241 0.93430657 0.91240876 0.89705882 0.91176471
|
|
0.86029412 0.89705882 0.84558824 0.91911765]
|
|
|
|
mean value: 0.8892926148561614
|
|
|
|
key: train_accuracy
|
|
value: [0.90627547 0.91850041 0.91768541 0.92257539 0.91368078 0.92263844
|
|
0.91693811 0.91775244 0.92345277 0.91286645]
|
|
|
|
mean value: 0.9172365665044638
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.87769784 0.93233083 0.91666667 0.89705882 0.91428571
|
|
0.86524823 0.89552239 0.85106383 0.92086331]
|
|
|
|
mean value: 0.8904070960759222
|
|
|
|
key: train_fscore
|
|
value: [0.90642799 0.91935484 0.91900561 0.92369478 0.91465378 0.92320129
|
|
0.91693811 0.91835085 0.92394822 0.91336032]
|
|
|
|
mean value: 0.9178935802733702
|
|
|
|
key: test_precision
|
|
value: [0.859375 0.85915493 0.96875 0.88 0.89705882 0.88888889
|
|
0.83561644 0.90909091 0.82191781 0.90140845]
|
|
|
|
mean value: 0.8821261248366242
|
|
|
|
key: train_precision
|
|
value: [0.90569106 0.91054313 0.90378549 0.90981013 0.9044586 0.91653291
|
|
0.91693811 0.9117175 0.91800643 0.90821256]
|
|
|
|
mean value: 0.9105695905456304
|
|
|
|
key: test_recall
|
|
value: [0.80882353 0.89705882 0.89855072 0.95652174 0.89705882 0.94117647
|
|
0.89705882 0.88235294 0.88235294 0.94117647]
|
|
|
|
mean value: 0.9002131287297528
|
|
|
|
key: train_recall
|
|
value: [0.90716612 0.92833876 0.93474715 0.93800979 0.92508143 0.92996743
|
|
0.91693811 0.92508143 0.92996743 0.91856678]
|
|
|
|
mean value: 0.9253864424972501
|
|
|
|
key: test_roc_auc
|
|
value: [0.83919437 0.87606564 0.93456948 0.9120844 0.89705882 0.91176471
|
|
0.86029412 0.89705882 0.84558824 0.91911765]
|
|
|
|
mean value: 0.8892796248934356
|
|
|
|
key: train_roc_auc
|
|
value: [0.90627474 0.91849238 0.91769931 0.92258796 0.91368078 0.92263844
|
|
0.91693811 0.91775244 0.92345277 0.91286645]
|
|
|
|
mean value: 0.9172383376463273
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.78205128 0.87323944 0.84615385 0.81333333 0.84210526
|
|
0.7625 0.81081081 0.74074074 0.85333333]
|
|
|
|
mean value: 0.8038553760486674
|
|
|
|
key: train_jcc
|
|
value: [0.82886905 0.85074627 0.85014837 0.85820896 0.84272997 0.85735736
|
|
0.84661654 0.8490284 0.85864662 0.84053651]
|
|
|
|
mean value: 0.8482888038296238
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [5.95455098 2.9566772 5.66405582 4.63643026 3.77995276 8.58176708
|
|
6.5036099 2.79477501 8.09094572 7.14753866]
|
|
|
|
mean value: 5.611030340194702
|
|
|
|
key: score_time
|
|
value: [0.01393247 0.02110529 0.01480699 0.01343775 0.03119874 0.01591206
|
|
0.01321292 0.02379489 0.02104306 0.01340723]
|
|
|
|
mean value: 0.018185138702392578
|
|
|
|
key: test_mcc
|
|
value: [0.98550725 0.84393916 0.97120941 0.9158731 0.94280904 0.91533482
|
|
0.88580789 0.90184995 0.95681396 0.97100831]
|
|
|
|
mean value: 0.9290152902280305
|
|
|
|
key: train_mcc
|
|
value: [0.9967453 0.96446746 0.99674532 0.99188303 0.97427222 0.99837266
|
|
0.98047683 0.89328059 0.99674796 0.99188957]
|
|
|
|
mean value: 0.9784880937538125
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.91970803 0.98540146 0.95620438 0.97058824 0.95588235
|
|
0.94117647 0.94852941 0.97794118 0.98529412]
|
|
|
|
mean value: 0.9633426363246028
|
|
|
|
key: train_accuracy
|
|
value: [0.99837001 0.98207009 0.99837001 0.99592502 0.98697068 0.99918567
|
|
0.99022801 0.94381107 0.99837134 0.99592834]
|
|
|
|
mean value: 0.9889230240330883
|
|
|
|
key: test_fscore
|
|
value: [0.99270073 0.92307692 0.98571429 0.95833333 0.97142857 0.95774648
|
|
0.94366197 0.95104895 0.97841727 0.98550725]
|
|
|
|
mean value: 0.9647635757797159
|
|
|
|
key: train_fscore
|
|
value: [0.99837398 0.98231511 0.99837134 0.99593826 0.98713826 0.99918633
|
|
0.99025974 0.94680031 0.99837398 0.99594485]
|
|
|
|
mean value: 0.9892702169739379
|
|
|
|
key: test_precision
|
|
value: [0.98550725 0.88 0.97183099 0.92 0.94444444 0.91891892
|
|
0.90540541 0.90666667 0.95774648 0.97142857]
|
|
|
|
mean value: 0.9361948718029551
|
|
|
|
key: train_precision
|
|
value: [0.99675325 0.96984127 0.99674797 0.99190939 0.97460317 0.99837398
|
|
0.98705502 0.89897511 0.99675325 0.99192246]
|
|
|
|
mean value: 0.9802934855848118
|
|
|
|
key: test_recall
|
|
value: [1. 0.97058824 1. 1. 1. 1.
|
|
0.98529412 1. 1. 1. ]
|
|
|
|
mean value: 0.9955882352941177
|
|
|
|
key: train_recall
|
|
value: [1. 0.99511401 1. 1. 1. 1.
|
|
0.99348534 1. 1. 1. ]
|
|
|
|
mean value: 0.9988599348534202
|
|
|
|
key: test_roc_auc
|
|
value: [0.99275362 0.92007673 0.98529412 0.95588235 0.97058824 0.95588235
|
|
0.94117647 0.94852941 0.97794118 0.98529412]
|
|
|
|
mean value: 0.9633418584825234
|
|
|
|
key: train_roc_auc
|
|
value: [0.99836868 0.98205945 0.99837134 0.99592834 0.98697068 0.99918567
|
|
0.99022801 0.94381107 0.99837134 0.99592834]
|
|
|
|
mean value: 0.988922291714269
|
|
|
|
key: test_jcc
|
|
value: [0.98550725 0.85714286 0.97183099 0.92 0.94444444 0.91891892
|
|
0.89333333 0.90666667 0.95774648 0.97142857]
|
|
|
|
mean value: 0.9327019503100336
|
|
|
|
key: train_jcc
|
|
value: [0.99675325 0.96524487 0.99674797 0.99190939 0.97460317 0.99837398
|
|
0.9807074 0.89897511 0.99675325 0.99192246]
|
|
|
|
mean value: 0.9791990831042809
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07329679 0.0687356 0.0710535 0.07005835 0.07708025 0.08290029
|
|
0.06846547 0.07136536 0.06657863 0.074476 ]
|
|
|
|
mean value: 0.07240102291107178
|
|
|
|
key: score_time
|
|
value: [0.01113582 0.01123357 0.01132107 0.01113486 0.01153493 0.0116272
|
|
0.01152778 0.01141524 0.01147556 0.01147532]
|
|
|
|
mean value: 0.011388134956359864
|
|
|
|
key: test_mcc
|
|
value: [0.91597649 0.92951942 1. 0.94318882 0.94280904 0.95681396
|
|
0.88852332 0.95681396 0.88852332 0.95681396]
|
|
|
|
mean value: 0.9378982292461375
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95620438 0.96350365 1. 0.97080292 0.97058824 0.97794118
|
|
0.94117647 0.97794118 0.94117647 0.97794118]
|
|
|
|
mean value: 0.9677275654787463
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.95774648 0.96453901 1. 0.97183099 0.97142857 0.97841727
|
|
0.94444444 0.97841727 0.94444444 0.97841727]
|
|
|
|
mean value: 0.9689685730759542
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.91891892 0.93150685 1. 0.94520548 0.94444444 0.95774648
|
|
0.89473684 0.95774648 0.89473684 0.95774648]
|
|
|
|
mean value: 0.9402788812960731
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95652174 0.96376812 1. 0.97058824 0.97058824 0.97794118
|
|
0.94117647 0.97794118 0.94117647 0.97794118]
|
|
|
|
mean value: 0.9677642796248934
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.91891892 0.93150685 1. 0.94520548 0.94444444 0.95774648
|
|
0.89473684 0.95774648 0.89473684 0.95774648]
|
|
|
|
mean value: 0.9402788812960731
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.23720622 0.2307353 0.23288369 0.236022 0.22834826 0.2347486
|
|
0.2318604 0.23015618 0.22972083 0.23117995]
|
|
|
|
mean value: 0.23228614330291747
|
|
|
|
key: score_time
|
|
value: [0.02257872 0.02273583 0.02290273 0.02357268 0.02258182 0.02260256
|
|
0.02258444 0.02248144 0.02255177 0.02253366]
|
|
|
|
mean value: 0.022712564468383788
|
|
|
|
key: test_mcc
|
|
value: [1. 0.94323594 1. 0.98550418 0.98540068 1.
|
|
0.98540068 1. 0.98540068 1. ]
|
|
|
|
mean value: 0.9884942149819166
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97080292 1. 0.99270073 0.99264706 1.
|
|
0.99264706 1. 0.99264706 1. ]
|
|
|
|
mean value: 0.9941444826105625
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.97142857 1. 0.99280576 0.99270073 1.
|
|
0.99270073 1. 0.99270073 1. ]
|
|
|
|
mean value: 0.9942336516605277
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.94444444 1. 0.98571429 0.98550725 1.
|
|
0.98550725 1. 0.98550725 1. ]
|
|
|
|
mean value: 0.9886680469289165
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.97101449 1. 0.99264706 0.99264706 1.
|
|
0.99264706 1. 0.99264706 1. ]
|
|
|
|
mean value: 0.994160272804774
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.94444444 1. 0.98571429 0.98550725 1.
|
|
0.98550725 1. 0.98550725 1. ]
|
|
|
|
mean value: 0.9886680469289165
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01662683 0.01681709 0.01678419 0.0168705 0.01649547 0.01645446
|
|
0.0163095 0.0163238 0.01661205 0.01642942]
|
|
|
|
mean value: 0.016572332382202147
|
|
|
|
key: score_time
|
|
value: [0.01098466 0.01094747 0.01095343 0.01092076 0.01088595 0.01097775
|
|
0.01093483 0.01091647 0.01096272 0.01096725]
|
|
|
|
mean value: 0.01094512939453125
|
|
|
|
key: test_mcc
|
|
value: [0.90259957 0.92951942 0.90246052 0.9158731 0.8753478 0.88852332
|
|
0.8623165 0.90184995 0.91533482 0.94280904]
|
|
|
|
mean value: 0.9036634035137634
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.96350365 0.94890511 0.95620438 0.93382353 0.94117647
|
|
0.92647059 0.94852941 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9493988836410476
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.95104895 0.96453901 0.95172414 0.95833333 0.93793103 0.94444444
|
|
0.93150685 0.95104895 0.95774648 0.97142857]
|
|
|
|
mean value: 0.951975175899855
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90666667 0.93150685 0.90789474 0.92 0.88311688 0.89473684
|
|
0.87179487 0.90666667 0.91891892 0.94444444]
|
|
|
|
mean value: 0.9085746879870888
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94927536 0.96376812 0.94852941 0.95588235 0.93382353 0.94117647
|
|
0.92647059 0.94852941 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9493925831202046
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90666667 0.93150685 0.90789474 0.92 0.88311688 0.89473684
|
|
0.87179487 0.90666667 0.91891892 0.94444444]
|
|
|
|
mean value: 0.9085746879870888
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [3.64647102 3.58429003 3.61780548 3.60837865 3.61581659 3.61313367
|
|
3.7440722 4.07526445 3.05344391 3.19134378]
|
|
|
|
mean value: 3.5750019788742065
|
|
|
|
key: score_time
|
|
value: [0.12135196 0.12102342 0.12090707 0.12046289 0.1204288 0.11982846
|
|
0.24435544 0.11087871 0.10419559 0.11555219]
|
|
|
|
mean value: 0.12989845275878906
|
|
|
|
key: test_mcc
|
|
value: [1. 0.95713391 1. 0.98550418 0.95681396 1.
|
|
0.95681396 0.97100831 0.91533482 0.98540068]
|
|
|
|
mean value: 0.972800982776614
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97810219 1. 0.99270073 0.97794118 1.
|
|
0.97794118 0.98529412 0.95588235 0.99264706]
|
|
|
|
mean value: 0.986050880206097
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.97841727 1. 0.99280576 0.97841727 1.
|
|
0.97841727 0.98550725 0.95774648 0.99270073]
|
|
|
|
mean value: 0.9864012009133892
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.95774648 1. 0.98571429 0.95774648 1.
|
|
0.95774648 0.97142857 0.91891892 0.98550725]
|
|
|
|
mean value: 0.9734808459058306
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.97826087 1. 0.99264706 0.97794118 1.
|
|
0.97794118 0.98529412 0.95588235 0.99264706]
|
|
|
|
mean value: 0.9860613810741689
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.95774648 1. 0.98571429 0.95774648 1.
|
|
0.95774648 0.97142857 0.91891892 0.98550725]
|
|
|
|
mean value: 0.9734808459058306
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.88
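
The `key:` / `value:` / `mean value:` blocks above have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is called with a multi-metric `scoring` mapping and `return_train_score=True`. The following is a minimal sketch of how output like this could be produced. The script itself is not shown in this log, so this is not its actual code. The synthetic data, the reduced `n_estimators`, the fold setup and the scorer definitions are all assumptions. In particular, `roc_auc` is built here from hard class predictions, which appears consistent with the logged values but is not confirmed.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (jaccard_score, make_scorer,
                             matthews_corrcoef, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): the real feature table is not in the log.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Same pipeline layout as the logged repr: a 'prep' step and a 'model' step.
# Only a numerical transformer is used because the toy data has no categories.
pipe = Pipeline(steps=[
    ("prep", ColumnTransformer(
        transformers=[("num", MinMaxScaler(), list(range(X.shape[1])))],
        remainder="passthrough")),
    # n_estimators reduced from the logged 1000 to keep the sketch fast.
    ("model", RandomForestClassifier(n_estimators=50, random_state=42)),
])

# The names chosen here become the 'test_<name>' / 'train_<name>' keys.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": make_scorer(roc_auc_score),  # from predicted labels (assumption)
    "jcc": make_scorer(jaccard_score),
}

# 10 folds, matching the 10 values per key in the log; shuffling is an assumption.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring,
                        return_train_score=True)

# Print each metric array and its mean, mirroring the log layout.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

`cross_validate` always adds `fit_time` and `score_time` to the result, which is why those two keys lead each block in the log.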
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
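
The FutureWarning above says that `max_features='auto'` is deprecated and that `'sqrt'` keeps the past behaviour for `RandomForestClassifier`. A minimal sketch of the 'Random Forest2' definition with that one change is shown below. The variable name `rf2` is hypothetical, and the other parameters are copied from the logged repr.

```python
from sklearn.ensemble import RandomForestClassifier

# Same settings as 'Random Forest2' in the log, except that the deprecated
# max_features='auto' is replaced by the explicit 'sqrt' named in the warning.
rf2 = RandomForestClassifier(
    max_features="sqrt",
    min_samples_leaf=5,
    n_estimators=1000,
    n_jobs=10,
    oob_score=True,
    random_state=42,
)

print(rf2.get_params()["max_features"])
```

Dropping the `max_features` argument entirely should have the same effect, since the warning states that `'sqrt'` is also the default for this estimator.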
|
|
|
|
key: fit_time
|
|
value: [2.45984983 2.94581032 2.88509512 2.17885375 1.31969047 1.28217196
|
|
1.37900114 1.31501865 1.3011024 1.29474759]
|
|
|
|
mean value: 1.836134123802185
|
|
|
|
key: score_time
|
|
value: [0.16247296 0.25455213 0.21584201 0.22554564 0.26957107 0.22703886
|
|
0.18415976 0.24923444 0.21996379 0.1540041 ]
|
|
|
|
mean value: 0.21623847484588624
|
|
|
|
key: test_mcc
|
|
value: [0.98550418 0.91281179 0.98550725 0.95710706 0.91176471 0.98540068
|
|
0.95681396 0.95598573 0.90184995 0.95681396]
|
|
|
|
mean value: 0.950955925944105
|
|
|
|
key: train_mcc
|
|
value: [0.98536269 0.98533135 0.98536289 0.98372107 0.98046123 0.98373423
|
|
0.99188957 0.98376033 0.98697592 0.98373423]
|
|
|
|
mean value: 0.9850333498563252
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.95620438 0.99270073 0.97810219 0.95588235 0.99264706
|
|
0.97794118 0.97794118 0.94852941 0.97794118]
|
|
|
|
mean value: 0.9750590382138257
|
|
|
|
key: train_accuracy
|
|
value: [0.99266504 0.99266504 0.99266504 0.99185004 0.99022801 0.99185668
|
|
0.99592834 0.99185668 0.99348534 0.99185668]
|
|
|
|
mean value: 0.9925056877158611
|
|
|
|
key: test_fscore
|
|
value: [0.99259259 0.95652174 0.99270073 0.9787234 0.95588235 0.99270073
|
|
0.97841727 0.97810219 0.95104895 0.97841727]
|
|
|
|
mean value: 0.9755107221977611
|
|
|
|
key: train_fscore
|
|
value: [0.99270073 0.99267697 0.99268887 0.99186992 0.9902439 0.99188312
|
|
0.99594485 0.99189627 0.99349593 0.99188312]
|
|
|
|
mean value: 0.9925283686021121
|
|
|
|
key: test_precision
|
|
value: [1. 0.94285714 1. 0.95833333 0.95588235 0.98550725
|
|
0.95774648 0.97101449 0.90666667 0.95774648]
|
|
|
|
mean value: 0.9635754192675233
|
|
|
|
key: train_precision
|
|
value: [0.98869144 0.99186992 0.98867314 0.98865478 0.98863636 0.98867314
|
|
0.99192246 0.98709677 0.99188312 0.98867314]
|
|
|
|
mean value: 0.9894774265463709
|
|
|
|
key: test_recall
|
|
value: [0.98529412 0.97058824 0.98550725 1. 0.95588235 1.
|
|
1. 0.98529412 1. 1. ]
|
|
|
|
mean value: 0.9882566069906223
|
|
|
|
key: train_recall
|
|
value: [0.99674267 0.99348534 0.99673736 0.99510604 0.99185668 0.99511401
|
|
1. 0.99674267 0.99511401 0.99511401]
|
|
|
|
mean value: 0.9956012774255942
|
|
|
|
key: test_roc_auc
|
|
value: [0.99264706 0.95630861 0.99275362 0.97794118 0.95588235 0.99264706
|
|
0.97794118 0.97794118 0.94852941 0.97794118]
|
|
|
|
mean value: 0.9750532821824383
|
|
|
|
key: train_roc_auc
|
|
value: [0.99266171 0.99266437 0.99266835 0.99185269 0.99022801 0.99185668
|
|
0.99592834 0.99185668 0.99348534 0.99185668]
|
|
|
|
mean value: 0.9925058849785591
|
|
|
|
key: test_jcc
|
|
value: [0.98529412 0.91666667 0.98550725 0.95833333 0.91549296 0.98550725
|
|
0.95774648 0.95714286 0.90666667 0.95774648]
|
|
|
|
mean value: 0.9526104049703163
|
|
|
|
key: train_jcc
|
|
value: [0.98550725 0.98546042 0.98548387 0.98387097 0.98067633 0.98389694
|
|
0.99192246 0.98392283 0.98707593 0.98389694]
|
|
|
|
mean value: 0.9851713928531682
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02761936 0.01877713 0.01874304 0.01867342 0.01853132 0.0185585
|
|
0.03597736 0.01850724 0.01871991 0.01877713]
|
|
|
|
mean value: 0.021288442611694335
|
|
|
|
key: score_time
|
|
value: [0.01272559 0.01290345 0.01310086 0.01296473 0.01297116 0.01303959
|
|
0.02226686 0.01296878 0.0130465 0.0130434 ]
|
|
|
|
mean value: 0.013903093338012696
|
|
|
|
key: test_mcc
|
|
value: [0.59324085 0.56235346 0.62041773 0.50373224 0.63242133 0.67911938
|
|
0.4738791 0.70710678 0.5008673 0.66240967]
|
|
|
|
mean value: 0.5935547845683921
|
|
|
|
key: train_mcc
|
|
value: [0.64145228 0.63325194 0.6040146 0.5949754 0.59609121 0.60099713
|
|
0.62396473 0.60912052 0.63687624 0.60749911]
|
|
|
|
mean value: 0.6148243154248934
|
|
|
|
key: test_accuracy
|
|
value: [0.79562044 0.7810219 0.81021898 0.75182482 0.81617647 0.83823529
|
|
0.73529412 0.85294118 0.75 0.83088235]
|
|
|
|
mean value: 0.7962215543151567
|
|
|
|
key: train_accuracy
|
|
value: [0.8207009 0.81662592 0.80195599 0.79706601 0.7980456 0.8004886
|
|
0.81188925 0.80456026 0.81840391 0.80374593]
|
|
|
|
mean value: 0.8073482368744508
|
|
|
|
key: test_fscore
|
|
value: [0.78461538 0.7826087 0.8115942 0.75714286 0.81751825 0.84507042
|
|
0.75 0.84848485 0.75714286 0.83453237]
|
|
|
|
mean value: 0.7988709890747785
|
|
|
|
key: train_fscore
|
|
value: [0.82200647 0.81692433 0.80355699 0.79128248 0.7980456 0.79967294
|
|
0.81415929 0.80456026 0.81972514 0.80422421]
|
|
|
|
mean value: 0.8074157715143396
|
|
|
|
key: test_precision
|
|
value: [0.82258065 0.77142857 0.8115942 0.74647887 0.8115942 0.81081081
|
|
0.71052632 0.875 0.73611111 0.81690141]
|
|
|
|
mean value: 0.79130261417885
|
|
|
|
key: train_precision
|
|
value: [0.81672026 0.81626016 0.79647436 0.8137931 0.7980456 0.80295567
|
|
0.80445151 0.80456026 0.81380417 0.80226904]
|
|
|
|
mean value: 0.8069334137924529
|
|
|
|
key: test_recall
|
|
value: [0.75 0.79411765 0.8115942 0.76811594 0.82352941 0.88235294
|
|
0.79411765 0.82352941 0.77941176 0.85294118]
|
|
|
|
mean value: 0.8079710144927537
|
|
|
|
key: train_recall
|
|
value: [0.82736156 0.81758958 0.81076672 0.76998369 0.7980456 0.79641694
|
|
0.82410423 0.80456026 0.8257329 0.80618893]
|
|
|
|
mean value: 0.8080750407830343
|
|
|
|
key: test_roc_auc
|
|
value: [0.79528986 0.78111679 0.81020887 0.75170503 0.81617647 0.83823529
|
|
0.73529412 0.85294118 0.75 0.83088235]
|
|
|
|
mean value: 0.7961849957374254
|
|
|
|
key: train_roc_auc
|
|
value: [0.82069546 0.81662513 0.80196317 0.79704396 0.7980456 0.8004886
|
|
0.81188925 0.80456026 0.81840391 0.80374593]
|
|
|
|
mean value: 0.8073461270730269
|
|
|
|
key: test_jcc
|
|
value: [0.64556962 0.64285714 0.68292683 0.6091954 0.69135802 0.73170732
|
|
0.6 0.73684211 0.6091954 0.71604938]
|
|
|
|
mean value: 0.6665701226720038
|
|
|
|
key: train_jcc
|
|
value: [0.6978022 0.69050894 0.67162162 0.65464632 0.66395664 0.66621253
|
|
0.68656716 0.67302452 0.69452055 0.67255435]
|
|
|
|
mean value: 0.6771414841563377
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [2.5532043 2.48869276 2.57855439 5.6198473 9.45530009 7.02551937
|
|
7.66338468 6.32294011 7.51284075 7.98216033]
|
|
|
|
mean value: 5.920244407653809
|
|
|
|
key: score_time
|
|
value: [0.01291919 0.01340175 0.01387787 0.03123879 0.04060364 0.01863265
|
|
0.01957917 0.0157547 0.02303362 0.03177905]
|
|
|
|
mean value: 0.022082042694091798
|
|
|
|
key: test_mcc
|
|
value: [1. 0.95713391 0.97120941 0.92944673 0.97100831 0.98540068
|
|
0.94280904 0.94280904 0.91533482 0.94280904]
|
|
|
|
mean value: 0.9557960991328401
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97810219 0.98540146 0.96350365 0.98529412 0.99264706
|
|
0.97058824 0.97058824 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9772595534564191
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.97841727 0.98571429 0.96503497 0.98550725 0.99270073
|
|
0.97142857 0.97142857 0.95774648 0.97142857]
|
|
|
|
mean value: 0.9779406686399074
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.95774648 0.97183099 0.93243243 0.97142857 0.98550725
|
|
0.94444444 0.94444444 0.91891892 0.94444444]
|
|
|
|
mean value: 0.9571197967278801
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.97826087 0.98529412 0.96323529 0.98529412 0.99264706
|
|
0.97058824 0.97058824 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9772378516624041
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.95774648 0.97183099 0.93243243 0.97142857 0.98550725
|
|
0.94444444 0.94444444 0.91891892 0.94444444]
|
|
|
|
mean value: 0.9571197967278801
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.95
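
In the XGBoost folds above, `test_recall` is 1.0 in every fold and the `test_jcc` values are identical to the `test_precision` values. This follows from the definitions. Recall of 1 means there are no false negatives (FN = 0), so the Jaccard score TP / (TP + FP + FN) reduces to TP / (TP + FP), which is precision. The sketch below checks this on a small set of hypothetical labels. These labels are not taken from the log.

```python
from sklearn.metrics import jaccard_score, precision_score, recall_score

# Hypothetical labels: all 4 positives are recovered (FN = 0), with 2 false positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]

recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
jaccard = jaccard_score(y_true, y_pred)      # TP / (TP + FP + FN)

print("recall:", recall)
print("precision:", precision)
print("jaccard:", jaccard)
```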
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.06341195 0.12838125 0.12218761 0.11328745 0.0931344 0.10351372
|
|
0.09466314 0.09430265 0.10396051 0.13386202]
|
|
|
|
mean value: 0.10507047176361084
|
|
|
|
key: score_time
|
|
value: [0.02171898 0.01990557 0.02713132 0.02102256 0.02411985 0.02375531
|
|
0.02291679 0.02295637 0.02355218 0.02601242]
|
|
|
|
mean value: 0.02330913543701172
|
|
|
|
key: test_mcc
|
|
value: [0.87086187 0.84393916 0.8251972 0.81433714 0.75008111 0.88388348
|
|
0.71364124 0.91176471 0.8131434 0.82675403]
|
|
|
|
mean value: 0.8253603346628763
|
|
|
|
key: train_mcc
|
|
value: [0.8599829 0.88152381 0.86490711 0.87027085 0.87543967 0.8684941
|
|
0.88152966 0.86385236 0.87175299 0.86206788]
|
|
|
|
mean value: 0.8699821323180184
|
|
|
|
key: test_accuracy
|
|
value: [0.93430657 0.91970803 0.91240876 0.90510949 0.875 0.94117647
|
|
0.85294118 0.95588235 0.90441176 0.91176471]
|
|
|
|
mean value: 0.9112709317303563
|
|
|
|
key: train_accuracy
|
|
value: [0.92991035 0.9405053 0.93235534 0.93480033 0.93729642 0.93403909
|
|
0.94055375 0.93159609 0.93566775 0.93078176]
|
|
|
|
mean value: 0.9347506165563635
|
|
|
|
key: test_fscore
|
|
value: [0.93129771 0.92307692 0.91176471 0.91034483 0.87591241 0.94285714
|
|
0.8630137 0.95588235 0.90909091 0.91549296]
|
|
|
|
mean value: 0.9138733636494115
|
|
|
|
key: train_fscore
|
|
value: [0.93064516 0.94155324 0.93301049 0.936 0.93864542 0.93504411
|
|
0.9414595 0.93290735 0.93664796 0.93194556]
|
|
|
|
mean value: 0.9357858782984593
|
|
|
|
key: test_precision
|
|
value: [0.96825397 0.88 0.92537313 0.86842105 0.86956522 0.91666667
|
|
0.80769231 0.95588235 0.86666667 0.87837838]
|
|
|
|
mean value: 0.8936899744950406
|
|
|
|
key: train_precision
|
|
value: [0.92172524 0.92598425 0.92332268 0.91836735 0.91887676 0.92101106
|
|
0.92733017 0.9153605 0.92259084 0.91653543]
|
|
|
|
mean value: 0.9211104281448699
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.97058824 0.89855072 0.95652174 0.88235294 0.97058824
|
|
0.92647059 0.95588235 0.95588235 0.95588235]
|
|
|
|
mean value: 0.9369778346121057
|
|
|
|
key: train_recall
|
|
value: [0.93973941 0.95765472 0.94290375 0.954323 0.95928339 0.9495114
|
|
0.95602606 0.95114007 0.95114007 0.94788274]
|
|
|
|
mean value: 0.9509604603833339
|
|
|
|
key: test_roc_auc
|
|
value: [0.93403666 0.92007673 0.91251066 0.90473146 0.875 0.94117647
|
|
0.85294118 0.95588235 0.90441176 0.91176471]
|
|
|
|
mean value: 0.9112531969309463
|
|
|
|
key: train_roc_auc
|
|
value: [0.92990233 0.94049131 0.93236393 0.93481622 0.93729642 0.93403909
|
|
0.94055375 0.93159609 0.93566775 0.93078176]
|
|
|
|
mean value: 0.9347508648128763
|
|
|
|
key: test_jcc
|
|
value: [0.87142857 0.85714286 0.83783784 0.83544304 0.77922078 0.89189189
|
|
0.75903614 0.91549296 0.83333333 0.84415584]
|
|
|
|
mean value: 0.8424983255310591
|
|
|
|
key: train_jcc
|
|
value: [0.87028658 0.88956127 0.87443268 0.87969925 0.88438438 0.87801205
|
|
0.88939394 0.8742515 0.88084465 0.87256372]
|
|
|
|
mean value: 0.8793430005520554
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.85
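
The blind-test lines report a fairly high accuracy alongside a much lower MCC (here 0.85 and 0.52). One common reason for such a gap is class imbalance, because accuracy is dominated by the majority class while MCC uses all four confusion-matrix cells. The class balance of the blind set is not shown in this log, so the example below is illustrative only. It uses hypothetical counts (10 positives, 90 negatives) and does not reproduce the logged numbers.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Hypothetical imbalanced test set: 10 positives, 90 negatives.
# Predictions: TP = 5, FN = 5, FP = 5, TN = 85.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85

accuracy = accuracy_score(y_true, y_pred)  # (TP + TN) / N = 90 / 100
# MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) = 400 / 900
mcc = matthews_corrcoef(y_true, y_pred)

print("Accuracy:", round(accuracy, 2))
print("MCC:", round(mcc, 2))
```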
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01756454 0.01748395 0.01753092 0.0176425 0.01723886 0.01738024
|
|
0.01743436 0.01817703 0.01737046 0.01723433]
|
|
|
|
mean value: 0.017505717277526856
|
|
|
|
key: score_time
|
|
value: [0.01266932 0.01253963 0.01251888 0.01266742 0.01258802 0.01264358
|
|
0.01258373 0.01260686 0.01254725 0.01253533]
|
|
|
|
mean value: 0.01259000301361084
|
|
|
|
key: test_mcc
|
|
value: [0.62437433 0.57838662 0.63063055 0.5912191 0.70618786 0.63242133
|
|
0.63242133 0.66529914 0.51610295 0.66183628]
|
|
|
|
mean value: 0.623887949482616
|
|
|
|
key: train_mcc
|
|
value: [0.62428456 0.65029189 0.62875517 0.64013666 0.61125608 0.62447543
|
|
0.6435692 0.62899827 0.64544441 0.64994144]
|
|
|
|
mean value: 0.6347153109168008
|
|
|
|
key: test_accuracy
|
|
value: [0.81021898 0.78832117 0.81021898 0.79562044 0.85294118 0.81617647
|
|
0.81617647 0.83088235 0.75735294 0.83088235]
|
|
|
|
mean value: 0.8108791326749678
|
|
|
|
key: train_accuracy
|
|
value: [0.81173594 0.82477588 0.81418093 0.8198859 0.80537459 0.81188925
|
|
0.82166124 0.81433225 0.82247557 0.82491857]
|
|
|
|
mean value: 0.817123011290481
|
|
|
|
key: test_fscore
|
|
value: [0.796875 0.79432624 0.79365079 0.79710145 0.85074627 0.81751825
|
|
0.81481481 0.82170543 0.7480916 0.82962963]
|
|
|
|
mean value: 0.8064459474747275
|
|
|
|
key: train_fscore
|
|
value: [0.80701754 0.8206839 0.81063123 0.81659751 0.80133001 0.80733945
|
|
0.81915772 0.81125828 0.81893688 0.82333607]
|
|
|
|
mean value: 0.8136288592998409
|
|
|
|
key: test_precision
|
|
value: [0.85 0.76712329 0.87719298 0.79710145 0.86363636 0.8115942
|
|
0.82089552 0.86885246 0.77777778 0.8358209 ]
|
|
|
|
mean value: 0.8269994940642269
|
|
|
|
key: train_precision
|
|
value: [0.82847341 0.84102564 0.82571912 0.83108108 0.81833616 0.82735043
|
|
0.83082077 0.82491582 0.83559322 0.83084577]
|
|
|
|
mean value: 0.8294161432878052
|
|
|
|
key: test_recall
|
|
value: [0.75 0.82352941 0.72463768 0.79710145 0.83823529 0.82352941
|
|
0.80882353 0.77941176 0.72058824 0.82352941]
|
|
|
|
mean value: 0.7889386189258312
|
|
|
|
key: train_recall
|
|
value: [0.78664495 0.80130293 0.79608483 0.80261011 0.78501629 0.78827362
|
|
0.80781759 0.7980456 0.8029316 0.81596091]
|
|
|
|
mean value: 0.7984688428245772
|
|
|
|
key: test_roc_auc
|
|
value: [0.80978261 0.7885763 0.81084825 0.79560955 0.85294118 0.81617647
|
|
0.81617647 0.83088235 0.75735294 0.83088235]
|
|
|
|
mean value: 0.8109228473998296
|
|
|
|
key: train_roc_auc
|
|
value: [0.81175641 0.82479502 0.81416619 0.81987183 0.80537459 0.81188925
|
|
0.82166124 0.81433225 0.82247557 0.82491857]
|
|
|
|
mean value: 0.8171240920129018
|
|
|
|
key: test_jcc
|
|
value: [0.66233766 0.65882353 0.65789474 0.6626506 0.74025974 0.69135802
|
|
0.6875 0.69736842 0.59756098 0.70886076]
|
|
|
|
mean value: 0.6764614452108327
|
|
|
|
key: train_jcc
|
|
value: [0.67647059 0.69589816 0.68156425 0.69004208 0.66851595 0.67692308
|
|
0.69370629 0.68245125 0.69338959 0.69972067]
|
|
|
|
mean value: 0.6858681907721815
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06309485 0.04974627 0.0544436 0.06055784 0.03951502 0.0307796
|
|
0.04072857 0.043648 0.03793311 0.03963995]
|
|
|
|
mean value: 0.04600868225097656
|
|
|
|
key: score_time
|
|
value: [0.0201335 0.012568 0.0126636 0.01268744 0.01263571 0.01257944
|
|
0.01267719 0.01261306 0.01259804 0.01271319]
|
|
|
|
mean value: 0.013386917114257813
|
|
|
|
key: test_mcc
|
|
value: [0.86948194 0.81247516 0.67142918 0.87308606 0.72627304 0.77949606
|
|
0.69128005 0.89715584 0.72669793 0.76409318]
|
|
|
|
mean value: 0.781146844024664
|
|
|
|
key: train_mcc
|
|
value: [0.85485659 0.87290331 0.70082438 0.86315544 0.72408634 0.82740107
|
|
0.8625736 0.86858635 0.77661648 0.8250729 ]
|
|
|
|
mean value: 0.8176076462340217
|
|
|
|
key: test_accuracy
|
|
value: [0.93430657 0.90510949 0.81021898 0.93430657 0.85294118 0.88970588
|
|
0.83823529 0.94852941 0.84558824 0.875 ]
|
|
|
|
mean value: 0.8833941605839416
|
|
|
|
key: train_accuracy
|
|
value: [0.92583537 0.93643032 0.83537082 0.93154034 0.85016287 0.91368078
|
|
0.92915309 0.93403909 0.8762215 0.90798046]
|
|
|
|
mean value: 0.9040414639132016
|
|
|
|
key: test_fscore
|
|
value: [0.9352518 0.90780142 0.76785714 0.93793103 0.83333333 0.89051095
|
|
0.85333333 0.94890511 0.86624204 0.88590604]
|
|
|
|
mean value: 0.8827072197886613
|
|
|
|
key: train_fscore
|
|
value: [0.92896175 0.93617021 0.80725191 0.93192869 0.82835821 0.91325696
|
|
0.93250582 0.93514812 0.88985507 0.91432904]
|
|
|
|
mean value: 0.901776576833011
|
|
|
|
key: test_precision
|
|
value: [0.91549296 0.87671233 1. 0.89473684 0.96153846 0.88405797
|
|
0.7804878 0.94202899 0.76404494 0.81481481]
|
|
|
|
mean value: 0.8833915110192154
|
|
|
|
key: train_precision
|
|
value: [0.89205397 0.94078947 0.97241379 0.92592593 0.96943231 0.91776316
|
|
0.89037037 0.91968504 0.80156658 0.85531915]
|
|
|
|
mean value: 0.9085319776343379
|
|
|
|
key: test_recall
|
|
value: [0.95588235 0.94117647 0.62318841 0.98550725 0.73529412 0.89705882
|
|
0.94117647 0.95588235 1. 0.97058824]
|
|
|
|
mean value: 0.9005754475703325
|
|
|
|
key: train_recall
|
|
value: [0.96905537 0.93159609 0.69004894 0.93800979 0.72312704 0.90879479
|
|
0.97882736 0.95114007 1. 0.98208469]
|
|
|
|
mean value: 0.9072684134735455
|
|
|
|
key: test_roc_auc
|
|
value: [0.93446292 0.90537084 0.8115942 0.93393009 0.85294118 0.88970588
|
|
0.83823529 0.94852941 0.84558824 0.875 ]
|
|
|
|
mean value: 0.8835358056265985
|
|
|
|
key: train_roc_auc
|
|
value: [0.92580012 0.93643426 0.83525248 0.93154561 0.85016287 0.91368078
|
|
0.92915309 0.93403909 0.8762215 0.90798046]
|
|
|
|
mean value: 0.9040270257344931
|
|
|
|
key: test_jcc
|
|
value: [0.87837838 0.83116883 0.62318841 0.88311688 0.71428571 0.80263158
|
|
0.74418605 0.90277778 0.76404494 0.79518072]
|
|
|
|
mean value: 0.7938959282695474
|
|
|
|
key: train_jcc
|
|
value: [0.86734694 0.88 0.6768 0.87253414 0.70700637 0.84036145
|
|
0.87354651 0.87819549 0.80156658 0.84217877]
|
|
|
|
mean value: 0.8239536247559656
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04600525 0.0352726 0.05412579 0.04311061 0.04166174 0.03660035
|
|
0.03348541 0.0660789 0.03916955 0.04450202]
|
|
|
|
mean value: 0.044001221656799316
|
|
|
|
key: score_time
|
|
value: [0.01239896 0.01267171 0.01264596 0.01269054 0.01274228 0.01272917
|
|
0.01265216 0.01130629 0.01268458 0.01304388]
|
|
|
|
mean value: 0.01255655288696289
|
|
|
|
key: test_mcc
|
|
value: [0.68527704 0.34901614 0.84660737 0.65589003 0.81961843 0.82388584
|
|
0.78017138 0.85442069 0.76470588 0.75653442]
|
|
|
|
mean value: 0.7336127219455257
|
|
|
|
key: train_mcc
|
|
value: [0.76254001 0.4686876 0.82112674 0.64474867 0.74249993 0.87026566
|
|
0.8681346 0.87473322 0.86578708 0.83150658]
|
|
|
|
mean value: 0.7750030085751844
|
|
|
|
key: test_accuracy
|
|
value: [0.82481752 0.6350365 0.91970803 0.81021898 0.90441176 0.91176471
|
|
0.88970588 0.92647059 0.88235294 0.86764706]
|
|
|
|
mean value: 0.8572133963074281
|
|
|
|
key: train_accuracy
|
|
value: [0.87286064 0.68296659 0.90301548 0.79869601 0.86074919 0.93485342
|
|
0.93403909 0.93729642 0.93241042 0.91042345]
|
|
|
|
mean value: 0.8767310699277122
|
|
|
|
key: test_fscore
|
|
value: [0.78947368 0.45652174 0.92517007 0.77586207 0.896 0.91044776
|
|
0.89208633 0.92424242 0.88235294 0.88157895]
|
|
|
|
mean value: 0.8333735965250287
|
|
|
|
key: train_fscore
|
|
value: [0.85818182 0.53964497 0.91139241 0.75175879 0.84210526 0.93366501
|
|
0.93366093 0.93672966 0.93077565 0.91704374]
|
|
|
|
mean value: 0.855495824279099
|
|
|
|
key: test_precision
|
|
value: [0.97826087 0.875 0.87179487 0.95744681 0.98245614 0.92424242
|
|
0.87323944 0.953125 0.88235294 0.79761905]
|
|
|
|
mean value: 0.9095537539879266
|
|
|
|
key: train_precision
|
|
value: [0.97119342 0.98701299 0.83835616 0.97905759 0.97228145 0.95101351
|
|
0.93904448 0.94527363 0.95384615 0.85393258]
|
|
|
|
mean value: 0.9391011973075327
|
|
|
|
key: test_recall
|
|
value: [0.66176471 0.30882353 0.98550725 0.65217391 0.82352941 0.89705882
|
|
0.91176471 0.89705882 0.88235294 0.98529412]
|
|
|
|
mean value: 0.8005328218243819
|
|
|
|
key: train_recall
|
|
value: [0.76872964 0.3713355 0.99836868 0.61011419 0.74267101 0.91693811
|
|
0.92833876 0.92833876 0.90879479 0.99022801]
|
|
|
|
mean value: 0.8163857463959487
|
|
|
|
key: test_roc_auc
|
|
value: [0.82363598 0.63267263 0.91922421 0.81138107 0.90441176 0.91176471
|
|
0.88970588 0.92647059 0.88235294 0.86764706]
|
|
|
|
mean value: 0.856926683716965
|
|
|
|
key: train_roc_auc
|
|
value: [0.87294557 0.68322077 0.90309313 0.79854244 0.86074919 0.93485342
|
|
0.93403909 0.93729642 0.93241042 0.91042345]
|
|
|
|
mean value: 0.8767573900983574
|
|
|
|
key: test_jcc
|
|
value: [0.65217391 0.29577465 0.86075949 0.63380282 0.8115942 0.83561644
|
|
0.80519481 0.85915493 0.78947368 0.78823529]
|
|
|
|
mean value: 0.7331780225858255
|
|
|
|
key: train_jcc
|
|
value: [0.75159236 0.36952998 0.8372093 0.60225443 0.72727273 0.8755832
|
|
0.87557604 0.88098918 0.87051482 0.84679666]
|
|
|
|
mean value: 0.763731869782806
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.43791914 0.44354892 0.59035397 0.4713099 0.5238173 0.48726654
|
|
0.46352696 0.45602107 0.49114156 0.51594257]
|
|
|
|
mean value: 0.4880847930908203
|
|
|
|
key: score_time
|
|
value: [0.02496362 0.02302623 0.02460146 0.02421665 0.02488732 0.0236969
|
|
0.03173876 0.02384472 0.02467775 0.02427077]
|
|
|
|
mean value: 0.02499241828918457
|
|
|
|
key: test_mcc
|
|
value: [0.94160273 0.92791659 0.91240409 0.87609014 0.91533482 0.92737353
|
|
0.90184995 0.94158382 0.89949371 0.91334626]
|
|
|
|
mean value: 0.915699564376063
|
|
|
|
key: train_mcc
|
|
value: [0.95970953 0.96772862 0.96266546 0.96633736 0.96951968 0.9547216
|
|
0.97086948 0.9630019 0.95617952 0.96109562]
|
|
|
|
mean value: 0.9631828760865743
|
|
|
|
key: test_accuracy
|
|
value: [0.97080292 0.96350365 0.95620438 0.93430657 0.95588235 0.96323529
|
|
0.94852941 0.97058824 0.94852941 0.95588235]
|
|
|
|
mean value: 0.9567464577071705
|
|
|
|
key: train_accuracy
|
|
value: [0.9796251 0.98370008 0.98125509 0.98288509 0.98452769 0.9771987
|
|
0.98534202 0.98127036 0.97801303 0.98045603]
|
|
|
|
mean value: 0.9814273180262763
|
|
|
|
key: test_fscore
|
|
value: [0.97058824 0.96402878 0.95652174 0.93877551 0.95774648 0.96240602
|
|
0.95104895 0.97101449 0.95035461 0.95714286]
|
|
|
|
mean value: 0.9579627666392394
|
|
|
|
key: train_fscore
|
|
value: [0.97995188 0.98392283 0.98140663 0.98315958 0.98476343 0.97749196
|
|
0.98548387 0.98155573 0.97820823 0.98064516]
|
|
|
|
mean value: 0.9816589318161805
|
|
|
|
key: test_precision
|
|
value: [0.97058824 0.94366197 0.95652174 0.88461538 0.91891892 0.98461538
|
|
0.90666667 0.95714286 0.91780822 0.93055556]
|
|
|
|
mean value: 0.9371094932948388
|
|
|
|
key: train_precision
|
|
value: [0.96524487 0.97142857 0.97275641 0.96687697 0.9699842 0.96507937
|
|
0.97603834 0.96682464 0.9696 0.97124601]
|
|
|
|
mean value: 0.9695079375901355
|
|
|
|
key: test_recall
|
|
value: [0.97058824 0.98529412 0.95652174 1. 1. 0.94117647
|
|
1. 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9809462915601024
|
|
|
|
key: train_recall
|
|
value: [0.99511401 0.99674267 0.99021207 1. 1. 0.99022801
|
|
0.99511401 0.99674267 0.98697068 0.99022801]
|
|
|
|
mean value: 0.994135213692472
|
|
|
|
key: test_roc_auc
|
|
value: [0.97080136 0.96366155 0.95620205 0.93382353 0.95588235 0.96323529
|
|
0.94852941 0.97058824 0.94852941 0.95588235]
|
|
|
|
mean value: 0.9567135549872123
|
|
|
|
key: train_roc_auc
|
|
value: [0.97961247 0.98368944 0.98126239 0.98289902 0.98452769 0.9771987
|
|
0.98534202 0.98127036 0.97801303 0.98045603]
|
|
|
|
mean value: 0.9814271139427496
|
|
|
|
key: test_jcc
|
|
value: [0.94285714 0.93055556 0.91666667 0.88461538 0.91891892 0.92753623
|
|
0.90666667 0.94366197 0.90540541 0.91780822]
|
|
|
|
mean value: 0.9194692163578867
|
|
|
|
key: train_jcc
|
|
value: [0.96069182 0.96835443 0.96349206 0.96687697 0.9699842 0.95597484
|
|
0.97138315 0.96377953 0.95734597 0.96202532]
|
|
|
|
mean value: 0.9639908297791469
|
|
|
|
MCC on Blind test: 0.65
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.27846217 0.28161573 0.26877713 0.2692008 0.27700591 0.27466035
|
|
0.27887869 0.29725051 0.26974511 0.19499779]
|
|
|
|
mean value: 0.269059419631958
|
|
|
|
key: score_time
|
|
value: [0.03478312 0.03430986 0.03051233 0.04022145 0.03191257 0.03114676
|
|
0.03156161 0.03014851 0.03189254 0.04394031]
|
|
|
|
mean value: 0.03404290676116943
|
|
|
|
key: test_mcc
|
|
value: [0.95713391 0.94323594 0.95710706 0.94318882 0.94280904 0.97100831
|
|
0.91533482 0.95681396 0.91533482 0.94280904]
|
|
|
|
mean value: 0.9444775740307054
|
|
|
|
key: train_mcc
|
|
value: [0.99837133 1. 1. 0.99674532 1. 0.99674796
|
|
0.99512588 0.99512588 0.99837266 0.99837266]
|
|
|
|
mean value: 0.9978861698276916
|
|
|
|
key: test_accuracy
|
|
value: [0.97810219 0.97080292 0.97810219 0.97080292 0.97058824 0.98529412
|
|
0.95588235 0.97794118 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9713986689566337
|
|
|
|
key: train_accuracy
|
|
value: [0.999185 1. 1. 0.99837001 1. 0.99837134
|
|
0.997557 0.997557 0.99918567 0.99918567]
|
|
|
|
mean value: 0.9989411689749369
|
|
|
|
key: test_fscore
|
|
value: [0.97841727 0.97142857 0.9787234 0.97183099 0.97142857 0.98550725
|
|
0.95774648 0.97841727 0.95774648 0.97142857]
|
|
|
|
mean value: 0.9722674840953918
|
|
|
|
key: train_fscore
|
|
value: [0.99918633 1. 1. 0.99837134 1. 0.99837398
|
|
0.99756296 0.99756296 0.99918633 0.99918633]
|
|
|
|
mean value: 0.9989430224185503
|
|
|
|
key: test_precision
|
|
value: [0.95774648 0.94444444 0.95833333 0.94520548 0.94444444 0.97142857
|
|
0.91891892 0.95774648 0.91891892 0.94444444]
|
|
|
|
mean value: 0.946163151313161
|
|
|
|
key: train_precision
|
|
value: [0.99837398 1. 1. 0.99674797 1. 0.99675325
|
|
0.99513776 0.99513776 0.99837398 0.99837398]
|
|
|
|
mean value: 0.9978898692194735
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97826087 0.97101449 0.97794118 0.97058824 0.97058824 0.98529412
|
|
0.95588235 0.97794118 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9713981244671782
|
|
|
|
key: train_roc_auc
|
|
value: [0.99918434 1. 1. 0.99837134 1. 0.99837134
|
|
0.997557 0.997557 0.99918567 0.99918567]
|
|
|
|
mean value: 0.9989412352344161
|
|
|
|
key: test_jcc
|
|
value: [0.95774648 0.94444444 0.95833333 0.94520548 0.94444444 0.97142857
|
|
0.91891892 0.95774648 0.91891892 0.94444444]
|
|
|
|
mean value: 0.946163151313161
|
|
|
|
key: train_jcc
|
|
value: [0.99837398 1. 1. 0.99674797 1. 0.99675325
|
|
0.99513776 0.99513776 0.99837398 0.99837398]
|
|
|
|
mean value: 0.9978898692194735
|
|
|
|
MCC on Blind test: 0.66
|
|
|
|
Accuracy on Blind test: 0.92
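In the Bagging Classifier folds above, test_recall is 1.0 in every fold and test_jcc repeats test_precision digit for digit. That follows from the definitions: with no false negatives, Jaccard TP/(TP+FP+FN) reduces to precision TP/(TP+FP). The sketch below recomputes the first fold's scores from a confusion matrix. The counts (TP=68, FP=3, TN=66, FN=0) are an assumption inferred from the printed values, not something the log reports, and `fold_metrics` is a hypothetical helper rather than a function from the scripts that produced this log.

```python
from math import sqrt


def fold_metrics(tp, fp, tn, fn):
    """Binary classification scores from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # MCC denominator; non-zero here because both classes are predicted.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),
        # ROC AUC computed from hard 0/1 predictions equals balanced accuracy.
        "roc_auc": (recall + specificity) / 2,
        "mcc": (tp * tn - fp * fn) / denom,
    }


# Assumed counts for fold 1 (137 test rows); inferred, not logged.
scores = fold_metrics(tp=68, fp=3, tn=66, fn=0)
for name, value in scores.items():
    print(f"{name}: {value:.8f}")
```

With these assumed counts the helper reproduces the first entry of test_accuracy, test_precision, test_fscore, test_roc_auc, test_jcc and test_mcc to the printed precision, which suggests the logged test_roc_auc was computed from hard class predictions rather than probabilities.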
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.63152909 1.51536298 1.45650649 1.48660588 1.36039352 1.3729527
|
|
1.38597703 1.38487315 1.30995417 1.31962204]
|
|
|
|
mean value: 1.4223777055740356
|
|
|
|
key: score_time
|
|
value: [0.08072424 0.08191442 0.0774982 0.07979679 0.07489109 0.0739162
|
|
0.07431293 0.07339907 0.0698266 0.07072353]
|
|
|
|
mean value: 0.07570030689239501
|
|
|
|
key: test_mcc
|
|
value: [0.90025835 0.84393916 0.94160273 0.90246052 0.91533482 0.88852332
|
|
0.8722811 0.91215932 0.83666003 0.94158382]
|
|
|
|
mean value: 0.8954803171637505
|
|
|
|
key: train_mcc
|
|
value: [0.97232223 0.97396649 0.97235367 0.96906252 0.97070464 0.97232431
|
|
0.97558168 0.96583007 0.97394653 0.97234494]
|
|
|
|
mean value: 0.9718437085256184
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.91970803 0.97080292 0.94890511 0.95588235 0.94117647
|
|
0.93382353 0.95588235 0.91176471 0.97058824]
|
|
|
|
mean value: 0.9457438814942035
|
|
|
|
key: train_accuracy
|
|
value: [0.98614507 0.98696007 0.98614507 0.98451508 0.98534202 0.98615635
|
|
0.98778502 0.98289902 0.98697068 0.98615635]
|
|
|
|
mean value: 0.9859074727427666
|
|
|
|
key: test_fscore
|
|
value: [0.95035461 0.92307692 0.97101449 0.95172414 0.95774648 0.94444444
|
|
0.93706294 0.95652174 0.91891892 0.97101449]
|
|
|
|
mean value: 0.9481879174874257
|
|
|
|
key: train_fscore
|
|
value: [0.98621249 0.98703404 0.98621249 0.98456539 0.98538961 0.98619009
|
|
0.98781478 0.98296837 0.98699187 0.98621249]
|
|
|
|
mean value: 0.9859591623455506
|
|
|
|
key: test_precision
|
|
value: [0.91780822 0.88 0.97101449 0.90789474 0.91891892 0.89473684
|
|
0.89333333 0.94285714 0.85 0.95714286]
|
|
|
|
mean value: 0.9133706543131326
|
|
|
|
key: train_precision
|
|
value: [0.9822294 0.98225806 0.98064516 0.98058252 0.98220065 0.98379254
|
|
0.98541329 0.97899838 0.98538961 0.9822294 ]
|
|
|
|
mean value: 0.9823739031415591
|
|
|
|
key: test_recall
|
|
value: [0.98529412 0.97058824 0.97101449 1. 1. 1.
|
|
0.98529412 0.97058824 1. 0.98529412]
|
|
|
|
mean value: 0.9868073316283035
|
|
|
|
key: train_recall
|
|
value: [0.99022801 0.99185668 0.99184339 0.98858075 0.98859935 0.98859935
|
|
0.99022801 0.98697068 0.98859935 0.99022801]
|
|
|
|
mean value: 0.9895733589810353
|
|
|
|
key: test_roc_auc
|
|
value: [0.9491688 0.92007673 0.97080136 0.94852941 0.95588235 0.94117647
|
|
0.93382353 0.95588235 0.91176471 0.97058824]
|
|
|
|
mean value: 0.9457693947144075
|
|
|
|
key: train_roc_auc
|
|
value: [0.98614174 0.98695607 0.98614971 0.98451839 0.98534202 0.98615635
|
|
0.98778502 0.98289902 0.98697068 0.98615635]
|
|
|
|
mean value: 0.9859075354294308
|
|
|
|
key: test_jcc
|
|
value: [0.90540541 0.85714286 0.94366197 0.90789474 0.91891892 0.89473684
|
|
0.88157895 0.91666667 0.85 0.94366197]
|
|
|
|
mean value: 0.9019668318111609
|
|
|
|
key: train_jcc
|
|
value: [0.9728 0.9744 0.9728 0.9696 0.9712 0.97275641
|
|
0.97592295 0.96650718 0.97431782 0.9728 ]
|
|
|
|
mean value: 0.9723104357755392
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.14314961 1.94710898 1.98164988 2.30299282 2.00202966 2.19179535
|
|
2.07759929 1.97926068 2.0706048 2.0779984 ]
|
|
|
|
mean value: 2.0774189472198485
|
|
|
|
key: score_time
|
|
value: [0.01415825 0.01516271 0.01362658 0.01378727 0.03376055 0.01379681
|
|
0.01373887 0.01374412 0.01389074 0.01364255]
|
|
|
|
mean value: 0.015930843353271485
|
|
|
|
key: test_mcc
|
|
value: [0.98550725 0.95713391 0.95710706 0.88920184 0.97100831 0.95681396
|
|
0.88852332 0.97100831 0.90184995 0.92898531]
|
|
|
|
mean value: 0.9407139215816761
|
|
|
|
key: train_mcc
|
|
value: [0.99350111 0.99350111 0.98865451 0.99026748 0.99350642 0.9902753
|
|
0.9902753 0.98544789 0.98705447 0.9902753 ]
|
|
|
|
mean value: 0.9902758884790179
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.97810219 0.97810219 0.94160584 0.98529412 0.97794118
|
|
0.94117647 0.98529412 0.94852941 0.96323529]
|
|
|
|
mean value: 0.9691981537140404
|
|
|
|
key: train_accuracy
|
|
value: [0.99674002 0.99674002 0.99429503 0.99511002 0.99674267 0.99511401
|
|
0.99511401 0.99267101 0.99348534 0.99511401]
|
|
|
|
mean value: 0.9951126127919849
|
|
|
|
key: test_fscore
|
|
value: [0.99270073 0.97841727 0.9787234 0.94520548 0.98550725 0.97841727
|
|
0.94444444 0.98550725 0.95104895 0.96453901]
|
|
|
|
mean value: 0.9704511041347699
|
|
|
|
key: train_fscore
|
|
value: [0.99675325 0.99675325 0.99432279 0.99512987 0.99675325 0.99513776
|
|
0.99513776 0.99272433 0.99352751 0.99513776]
|
|
|
|
mean value: 0.995137753160077
|
|
|
|
key: test_precision
|
|
value: [0.98550725 0.95774648 0.95833333 0.8961039 0.97142857 0.95774648
|
|
0.89473684 0.97142857 0.90666667 0.93150685]
|
|
|
|
mean value: 0.9431204934504661
|
|
|
|
key: train_precision
|
|
value: [0.99352751 0.99352751 0.98870968 0.99030695 0.99352751 0.99032258
|
|
0.99032258 0.98555377 0.98713826 0.99032258]
|
|
|
|
mean value: 0.9903258926051111
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.99275362 0.97826087 0.97794118 0.94117647 0.98529412 0.97794118
|
|
0.94117647 0.98529412 0.94852941 0.96323529]
|
|
|
|
mean value: 0.9691602728047741
|
|
|
|
key: train_roc_auc
|
|
value: [0.99673736 0.99673736 0.99429967 0.99511401 0.99674267 0.99511401
|
|
0.99511401 0.99267101 0.99348534 0.99511401]
|
|
|
|
mean value: 0.9951129437645796
|
|
|
|
key: test_jcc
|
|
value: [0.98550725 0.95774648 0.95833333 0.8961039 0.97142857 0.95774648
|
|
0.89473684 0.97142857 0.90666667 0.93150685]
|
|
|
|
mean value: 0.9431204934504661
|
|
|
|
key: train_jcc
|
|
value: [0.99352751 0.99352751 0.98870968 0.99030695 0.99352751 0.99032258
|
|
0.99032258 0.98555377 0.98713826 0.99032258]
|
|
|
|
mean value: 0.9903258926051111
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.11367607 0.09891033 0.09971118 0.15056086 0.10581017 0.10995078
|
|
0.12116456 0.09784818 0.09590054 0.105196 ]
|
|
|
|
mean value: 0.10987286567687989
|
|
|
|
key: score_time
|
|
value: [0.02419925 0.01436257 0.01867843 0.02265835 0.02866459 0.02885079
|
|
0.02220368 0.02176857 0.05660486 0.04176497]
|
|
|
|
mean value: 0.02797560691833496
|
|
|
|
key: test_mcc
|
|
value: [1. 0.97122151 1. 1. 1. 0.98540068
|
|
0.98540068 1. 1. 1. ]
|
|
|
|
mean value: 0.9942022862523039
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98540146 1. 1. 1. 0.99264706
|
|
0.99264706 1. 1. 1. ]
|
|
|
|
mean value: 0.9970695577501073
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98550725 1. 1. 1. 0.99270073
|
|
0.99270073 1. 1. 1. ]
|
|
|
|
mean value: 0.9970908706230827
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.97142857 1. 1. 1. 0.98550725
|
|
0.98550725 1. 1. 1. ]
|
|
|
|
mean value: 0.9942443064182195
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98550725 1. 1. 1. 0.99264706
|
|
0.99264706 1. 1. 1. ]
|
|
|
|
mean value: 0.997080136402387
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.97142857 1. 1. 1. 0.98550725
|
|
0.98550725 1. 1. 1. ]
|
|
|
|
mean value: 0.9942443064182195
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.86
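For QDA, a blind-test MCC of 0.0 next to 0.86 accuracy, together with the "no predicted samples" precision warning printed earlier in this block, is consistent with the model assigning every blind-test row to a single class. The sketch below shows that arithmetic with hypothetical labels (86 negatives and 14 positives, all predicted negative), not the run's data. Returning 0.0 when the MCC denominator is zero is a convention chosen here; it is believed to match what scikit-learn's `matthews_corrcoef` does, but treat that as an assumption about the installed version.

```python
from math import sqrt


def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and MCC for binary 0/1 labels (pure-Python sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Zero denominator: some class is never predicted (or never present),
    # so report 0.0 instead of dividing by zero.
    mcc = 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
    return accuracy, mcc


# Hypothetical blind set: 86 negatives, 14 positives, everything predicted 0.
y_true = [0] * 86 + [1] * 14
y_pred = [0] * 100
accuracy, mcc = accuracy_and_mcc(y_true, y_pred)
print(f"accuracy={accuracy:.2f} mcc={mcc:.2f}")
```

Accuracy stays high because it tracks the majority class, while MCC does not, which is why the two blind-test numbers diverge so sharply despite near-perfect cross-validation scores.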
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06178665 0.08366227 0.05242634 0.06143975 0.05042601 0.04940557
|
|
0.04602766 0.0709939 0.04714084 0.06406546]
|
|
|
|
mean value: 0.058737444877624514
|
|
|
|
key: score_time
|
|
value: [0.04089236 0.02844954 0.0293684 0.04122448 0.0418787 0.04635525
|
|
0.04557586 0.04112935 0.03704405 0.02668762]
|
|
|
|
mean value: 0.037860560417175296
|
|
|
|
key: test_mcc
|
|
value: [0.87086187 0.85739162 0.85440207 0.85721269 0.82495791 0.808911
|
|
0.75665657 0.91176471 0.79967098 0.85628096]
|
|
|
|
mean value: 0.8398110382588702
|
|
|
|
key: train_mcc
|
|
value: [0.86319346 0.87482365 0.86512897 0.8699782 0.88172633 0.87147537
|
|
0.88972295 0.87492825 0.88788715 0.88320483]
|
|
|
|
mean value: 0.8762069139371929
|
|
|
|
key: test_accuracy
|
|
value: [0.93430657 0.9270073 0.9270073 0.9270073 0.91176471 0.90441176
|
|
0.875 0.95588235 0.89705882 0.92647059]
|
|
|
|
mean value: 0.9185916702447402
|
|
|
|
key: train_accuracy
|
|
value: [0.93154034 0.93724531 0.93235534 0.93480033 0.94055375 0.93566775
|
|
0.94462541 0.93729642 0.94381107 0.94136808]
|
|
|
|
mean value: 0.9379263795863431
|
|
|
|
key: test_fscore
|
|
value: [0.93129771 0.92957746 0.92647059 0.93055556 0.90909091 0.90510949
|
|
0.88275862 0.95588235 0.90277778 0.92957746]
|
|
|
|
mean value: 0.9203097932842592
|
|
|
|
key: train_fscore
|
|
value: [0.93214863 0.93815261 0.93333333 0.93569132 0.94164668 0.9362389
|
|
0.94551282 0.93815261 0.94448914 0.94230769]
|
|
|
|
mean value: 0.9387673736356681
|
|
|
|
key: test_precision
|
|
value: [0.96825397 0.89189189 0.94029851 0.89333333 0.9375 0.89855072
|
|
0.83116883 0.95588235 0.85526316 0.89189189]
|
|
|
|
mean value: 0.9064034659476198
|
|
|
|
key: train_precision
|
|
value: [0.92467949 0.92551506 0.9193038 0.92234548 0.92464678 0.928
|
|
0.93059937 0.92551506 0.93322734 0.92744479]
|
|
|
|
mean value: 0.9261277169762157
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.97058824 0.91304348 0.97101449 0.88235294 0.91176471
|
|
0.94117647 0.95588235 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9369352088661551
|
|
|
|
key: train_recall
|
|
value: [0.93973941 0.95114007 0.94779772 0.94942904 0.95928339 0.94462541
|
|
0.96091205 0.95114007 0.95602606 0.95765472]
|
|
|
|
mean value: 0.9517747926308909
|
|
|
|
key: test_roc_auc
|
|
value: [0.93403666 0.9273231 0.92710997 0.92668372 0.91176471 0.90441176
|
|
0.875 0.95588235 0.89705882 0.92647059]
|
|
|
|
mean value: 0.918574168797954
|
|
|
|
key: train_roc_auc
|
|
value: [0.93153365 0.93723398 0.93236791 0.93481224 0.94055375 0.93566775
|
|
0.94462541 0.93729642 0.94381107 0.94136808]
|
|
|
|
mean value: 0.9379270262658681
|
|
|
|
key: test_jcc
|
|
value: [0.87142857 0.86842105 0.8630137 0.87012987 0.83333333 0.82666667
|
|
0.79012346 0.91549296 0.82278481 0.86842105]
|
|
|
|
mean value: 0.8529815470114921
|
|
|
|
key: train_jcc
|
|
value: [0.87291982 0.88350983 0.875 0.87915408 0.8897281 0.8801214
|
|
0.89665653 0.88350983 0.89481707 0.89090909]
|
|
|
|
mean value: 0.8846325755943281
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.6412735 0.60913801 0.64228439 0.55539966 0.62312245 0.62011957
|
|
0.60597944 0.62014318 0.58393216 0.58312726]
|
|
|
|
mean value: 0.6084519624710083
|
|
|
|
key: score_time
|
|
value: [0.0293262 0.02866387 0.02711129 0.02833629 0.03939867 0.03987908
|
|
0.03436828 0.02823949 0.03884816 0.04016376]
|
|
|
|
mean value: 0.03343350887298584
|
|
|
|
key: test_mcc
|
|
value: [0.91392776 0.85739162 0.85440207 0.80402464 0.82495791 0.808911
|
|
0.74337629 0.92657079 0.8131434 0.85628096]
|
|
|
|
mean value: 0.8402986441758847
|
|
|
|
key: train_mcc
|
|
value: [0.8797564 0.87994298 0.86512897 0.8847922 0.88172633 0.87147537
|
|
0.89289191 0.87333954 0.88611102 0.88478855]
|
|
|
|
mean value: 0.8799953256651453
|
|
|
|
key: test_accuracy
|
|
value: [0.95620438 0.9270073 0.9270073 0.89781022 0.91176471 0.90441176
|
|
0.86764706 0.96323529 0.90441176 0.92647059]
|
|
|
|
mean value: 0.918597037355088
|
|
|
|
key: train_accuracy
|
|
value: [0.9396903 0.9396903 0.93235534 0.94213529 0.94055375 0.93566775
|
|
0.94625407 0.93648208 0.94299674 0.94218241]
|
|
|
|
mean value: 0.9398008038461436
|
|
|
|
key: test_fscore
|
|
value: [0.95454545 0.92957746 0.92647059 0.90540541 0.90909091 0.90510949
|
|
0.87671233 0.96296296 0.90909091 0.92957746]
|
|
|
|
mean value: 0.9208542976726618
|
|
|
|
key: train_fscore
|
|
value:
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:156: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:159: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.94060995 0.9408 0.93333333 0.94306335 0.94164668 0.9362389
|
|
0.9470305 0.93739968 0.94345719 0.94306335]
|
|
|
|
mean value: 0.9406642939843077
|
|
|
|
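The two SettingWithCopyWarning messages from embb_cd_sl.py (lines 156 and 159) point at in-place sorts of ros_CT and ros_BT, which pandas suspects are slices of another DataFrame. A minimal sketch of one common way to avoid the warning follows. The `scores` table, its column values and the `source` filter are hypothetical stand-ins, since the log does not show how ros_CT is built.

```python
import pandas as pd

# Hypothetical results table; the real columns of ros_CT are not shown in the log.
scores = pd.DataFrame({
    "model": ["A", "B", "C"],
    "test_mcc": [0.60, 0.72, 0.41],
    "source": ["CV", "CV", "CV"],
})

# An explicit .copy() makes the subset an independent DataFrame,
# so later modifications cannot be mistaken for writes to a slice.
ros_CT = scores.loc[scores["source"] == "CV"].copy()

# Reassigning the sorted result avoids inplace=True altogether.
ros_CT = ros_CT.sort_values(by=["test_mcc"], ascending=False)

print(ros_CT)
```

The original `scores` frame keeps its row order; only the copy is sorted.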
key: test_precision
value: [0.984375 0.89189189 0.94029851 0.84810127 0.9375 0.89855072
 0.82051282 0.97014925 0.86666667 0.89189189]

mean value: 0.9049938022617767

key: train_precision
value: [0.92721519 0.9245283 0.9193038 0.92744479 0.92464678 0.928
 0.9335443 0.92405063 0.93589744 0.92890995]

mean value: 0.9273541191183816

key: test_recall
value: [0.92647059 0.97058824 0.91304348 0.97101449 0.88235294 0.91176471
 0.94117647 0.95588235 0.95588235 0.97058824]

mean value: 0.9398763853367433

key: train_recall
value: [0.95439739 0.95765472 0.94779772 0.95921697 0.95928339 0.94462541
 0.96091205 0.95114007 0.95114007 0.95765472]

mean value: 0.9543822499481909

key: test_roc_auc
value: [0.95598892 0.9273231 0.92710997 0.89727195 0.91176471 0.90441176
 0.86764706 0.96323529 0.90441176 0.92647059]

mean value: 0.9185635123614664

key: train_roc_auc
value: [0.93967831 0.93967565 0.93236791 0.9421492 0.94055375 0.93566775
 0.94625407 0.93648208 0.94299674 0.94218241]

mean value: 0.939800787497808

key: test_jcc
value: [0.91304348 0.86842105 0.8630137 0.82716049 0.83333333 0.82666667
 0.7804878 0.92857143 0.83333333 0.86842105]

mean value: 0.8542452342764135

key: train_jcc
value: [0.88787879 0.88821752 0.875 0.892261 0.8897281 0.8801214
 0.89939024 0.88217523 0.89296636 0.892261 ]

mean value: 0.8879999637648476

MCC on Blind test: 0.58

Accuracy on Blind test: 0.88

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])

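The pipeline printed above scales the numerical columns with MinMaxScaler, one-hot encodes the categorical columns, and passes the result to the classifier. The sketch below builds the same structure on a toy frame. It is an illustration under assumptions, not the project's code: only three of the logged column names are used, and every data value (including the `ss_class` categories) is invented.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy data: column names come from the log, values are made up for illustration.
X = pd.DataFrame({
    "ligand_distance": [3.2, 12.5, 7.1, 20.4, 5.6, 15.0],
    "ddg_foldx": [-1.2, 0.4, 2.3, -0.7, 1.1, 0.0],
    "ss_class": ["helix", "strand", "loop", "helix", "loop", "strand"],
})
y = [1, 0, 1, 0, 1, 0]

num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class"]

# Same step names and layout as the logged Pipeline repr.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
)
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(random_state=42)),
])

pipe.fit(X, y)

# Inspect what the classifier actually sees after preprocessing.
Xt = pipe.named_steps["prep"].transform(X)
print(Xt.shape)
```

Keeping the preprocessing inside the Pipeline means the scaler and encoder are refitted on each training fold during cross-validation, so no information from the held-out fold leaks into the scaling.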
key: fit_time
value: [0.06837058 0.07014608 0.07498884 0.09877658 0.08167028 0.07799149
 0.08944106 0.09302378 0.11649036 0.10551286]

mean value: 0.08764119148254394

key: score_time
value: [0.02288461 0.022717 0.02236319 0.02235794 0.02454734 0.02454281
 0.0242734 0.02331662 0.02335644 0.03278542]

mean value: 0.024314475059509278

key: test_mcc
value: [0.47727273 0.91666667 0.68313005 0.45454545 0.91287093 0.54772256
 0.56694671 0.56694671 0.54772256 0.2773501 ]

mean value: 0.5951174460874726

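Each "mean value" line is the arithmetic mean of the ten per-fold scores printed just before it. As a quick check, the sketch below recomputes the test_mcc mean from the rounded fold values in this log and also reports the fold-to-fold spread, which is large for this model. The variable names are illustrative only.

```python
from statistics import fmean, stdev

# Per-fold test_mcc values copied from the log (rounded to 8 decimals there).
test_mcc = [0.47727273, 0.91666667, 0.68313005, 0.45454545, 0.91287093,
            0.54772256, 0.56694671, 0.56694671, 0.54772256, 0.2773501]
reported_mean = 0.5951174460874726

fold_mean = fmean(test_mcc)
spread = max(test_mcc) - min(test_mcc)

print(f"recomputed mean: {fold_mean:.8f}")
print(f"reported mean:   {reported_mean:.8f}")
print(f"min-max spread:  {spread:.8f}")
print(f"sample std dev:  {stdev(test_mcc):.8f}")
```

The recomputed mean agrees with the logged value up to the rounding of the printed fold scores.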
key: train_mcc
value: [0.84076981 0.82168025 0.82148 0.8100405 0.8014439 0.83103945
 0.8014439 0.81036475 0.83103945 0.81199182]

mean value: 0.8181293826538867

key: test_accuracy
value: [0.73913043 0.95652174 0.81818182 0.72727273 0.95454545 0.77272727
 0.77272727 0.77272727 0.77272727 0.63636364]

mean value: 0.792292490118577

key: train_accuracy
value: [0.91959799 0.90954774 0.91 0.905 0.9 0.915
 0.9 0.905 0.915 0.905 ]

mean value: 0.9084145728643216

key: test_fscore
value: [0.72727273 0.95652174 0.84615385 0.72727273 0.95652174 0.76190476
 0.73684211 0.8 0.7826087 0.6 ]

mean value: 0.7895098341780264

key: train_fscore
value: [0.91752577 0.90526316 0.90721649 0.90452261 0.89690722 0.91282051
 0.89690722 0.9035533 0.91282051 0.9015544 ]

mean value: 0.905909120126948

key: test_precision
value: [0.72727273 1. 0.73333333 0.72727273 0.91666667 0.8
 0.875 0.71428571 0.75 0.66666667]

mean value: 0.7910497835497835

key: train_precision
value: [0.94680851 0.94505495 0.93617021 0.90909091 0.92553191 0.93684211
 0.92553191 0.91752577 0.93684211 0.93548387]

mean value: 0.9314882262027278

key: test_recall
value: [0.72727273 0.91666667 1. 0.72727273 1. 0.72727273
 0.63636364 0.90909091 0.81818182 0.54545455]

mean value: 0.8007575757575758

key: train_recall
value: [0.89 0.86868687 0.88 0.9 0.87 0.89
 0.87 0.89 0.89 0.87 ]

mean value: 0.8818686868686869

key: test_roc_auc
value: [0.73863636 0.95833333 0.81818182 0.72727273 0.95454545 0.77272727
 0.77272727 0.77272727 0.77272727 0.63636364]

mean value: 0.7924242424242424

key: train_roc_auc
value: [0.91974747 0.90934343 0.91 0.905 0.9 0.915
 0.9 0.905 0.915 0.905 ]

mean value: 0.9084090909090909

key: test_jcc
value: [0.57142857 0.91666667 0.73333333 0.57142857 0.91666667 0.61538462
 0.58333333 0.66666667 0.64285714 0.42857143]

mean value: 0.6646336996336997

key: train_jcc
value: [0.84761905 0.82692308 0.83018868 0.82568807 0.81308411 0.83962264
 0.81308411 0.82407407 0.83962264 0.82075472]

mean value: 0.8280661175555043

MCC on Blind test: 0.49

Accuracy on Blind test: 0.83

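The per-fold scores above (accuracy, precision, recall, fscore, mcc, jcc, roc_auc) can all be derived from one confusion matrix. The sketch below is a worked example for the second test fold of this Logistic Regression run. The counts TP=11, FP=0, TN=11, FN=1 are an inference: they are not printed in the log, but they are consistent with that fold's logged values. The roc_auc line assumes the score was computed on hard 0/1 predictions, in which case it equals the mean of recall and specificity.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return the logged metric set computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        # Matthews correlation coefficient
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        # Jaccard index of the positive class
        "jcc": tp / (tp + fp + fn),
        # AUC of a single hard-threshold classifier (assumption, see lead-in)
        "roc_auc": (recall + specificity) / 2,
    }


# Inferred counts for fold 2 (23 test samples); not taken from the log itself.
fold = binary_metrics(tp=11, fp=0, tn=11, fn=1)
for name, value in fold.items():
    print(f"{name}: {value:.8f}")
```

A single misclassified sample in a 23-sample fold moves accuracy by about 0.04, which helps explain why the test scores vary so much more between folds than the training scores do.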
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
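The ConvergenceWarning suggests two remedies: raise max_iter or scale the data. The pipeline logged for the Logistic Regression run already applies MinMaxScaler to the numerical columns, so raising max_iter may be the more relevant option here. The sketch below shows both remedies together on synthetic data and turns the warning into an error so that a non-converged fit cannot go unnoticed. The data, the value max_iter=5000 and the step names are assumptions for illustration.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)

# Synthetic stand-in data: 200 rows, 20 features on very different scales.
X = rng.normal(size=(200, 20)) * rng.uniform(1, 1000, size=20)
y = (X[:, 0] / X[:, 0].std() + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = Pipeline(steps=[
    ("scale", MinMaxScaler()),
    # The default max_iter is 100; 5000 is an arbitrary generous limit.
    ("model", LogisticRegression(max_iter=5000, random_state=42)),
])

# Escalate ConvergenceWarning to an exception while fitting.
with warnings.catch_warnings():
    warnings.simplefilter("error", category=ConvergenceWarning)
    model.fit(X, y)

n_iter = int(model.named_steps["model"].n_iter_[0])
print("lbfgs iterations used:", n_iter)
```

LogisticRegressionCV accepts the same max_iter parameter, so the same change would apply to the run that produced the warnings above.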
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.98393965 1.9290812 2.16848993 1.89898634 1.75924253 1.58824778
|
|
1.68193698 1.5082283 0.945472 1.10976696]
|
|
|
|
mean value: 1.6573391675949096
|
|
|
|
key: score_time
|
|
value: [0.02464771 0.02251983 0.02378535 0.02463198 0.02498102 0.02502775
|
|
0.02192497 0.01491308 0.01212025 0.01495695]
|
|
|
|
mean value: 0.020950889587402342
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.76764947 0.83205029 0.45454545 0.63636364 0.2773501
|
|
0.83205029 0.64715023 0.54772256 0.54772256]
|
|
|
|
mean value: 0.6019877322488625
|
|
|
|
key: train_mcc
|
|
value: [0.92035594 1. 0.92073688 0.89040077 1. 1.
|
|
1. 0.98 0.74014804 0.9900495 ]
|
|
|
|
mean value: 0.9441691143454533
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.86956522 0.90909091 0.72727273 0.81818182 0.63636364
|
|
0.90909091 0.81818182 0.77272727 0.77272727]
|
|
|
|
mean value: 0.7972332015810276
|
|
|
|
key: train_accuracy
|
|
value: [0.95979899 1. 0.96 0.945 1. 1.
|
|
1. 0.99 0.87 0.995 ]
|
|
|
|
mean value: 0.9719798994974874
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.85714286 0.91666667 0.72727273 0.81818182 0.66666667
|
|
0.9 0.83333333 0.7826087 0.7826087 ]
|
|
|
|
mean value: 0.8011754187841145
|
|
|
|
key: train_fscore
|
|
value: [0.95918367 1. 0.95918367 0.94416244 1. 1.
|
|
1. 0.99 0.86868687 0.99502488]
|
|
|
|
mean value: 0.9716241527795758
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 0.84615385 0.72727273 0.81818182 0.61538462
|
|
1. 0.76923077 0.75 0.75 ]
|
|
|
|
mean value: 0.8003496503496503
|
|
|
|
key: train_precision
|
|
value: [0.97916667 1. 0.97916667 0.95876289 1. 1.
|
|
1. 0.99 0.87755102 0.99009901]
|
|
|
|
mean value: 0.9774746250240425
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.75 1. 0.72727273 0.81818182 0.72727273
|
|
0.81818182 0.90909091 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8113636363636364
|
|
|
|
key: train_recall
|
|
value: [0.94 1. 0.94 0.93 1. 1. 1. 0.99 0.86 1. ]
|
|
|
|
mean value: 0.966
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.875 0.90909091 0.72727273 0.81818182 0.63636364
|
|
0.90909091 0.81818182 0.77272727 0.77272727]
|
|
|
|
mean value: 0.7977272727272727
|
|
|
|
key: train_roc_auc
|
|
value: [0.95989899 1. 0.96 0.945 1. 1.
|
|
1. 0.99 0.87 0.995 ]
|
|
|
|
mean value: 0.971989898989899
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.75 0.84615385 0.57142857 0.69230769 0.5
|
|
0.81818182 0.71428571 0.64285714 0.64285714]
|
|
|
|
mean value: 0.6749500499500499
|
|
|
|
key: train_jcc
|
|
value: [0.92156863 1. 0.92156863 0.89423077 1. 1.
|
|
1. 0.98019802 0.76785714 0.99009901]
|
|
|
|
mean value: 0.9475522196692843
|
|
|
|
MCC on Blind test: 0.49
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
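The per-fold metrics above all derive from one confusion matrix per fold. As a worked check, the sketch below uses counts that are an inference, not something printed in this log: TP=8, FN=3, FP=3, TN=8 (22 test samples, 11 per class). These counts are consistent with the fourth fold of the LogisticRegressionCV block above (precision, recall, accuracy and F-score all 0.72727273, jcc 0.57142857, mcc 0.45454545).

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal is empty (denominator is zero).
    return num / den if den else 0.0


# Assumed counts for one fold (inferred from the reported scores, see lead-in).
tp, fn, fp, tn = 8, 3, 3, 8

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / (tp + tn + fp + fn)
fscore = 2 * precision * recall / (precision + recall)
jcc = tp / (tp + fp + fn)  # Jaccard index of the positive class
fold_mcc = mcc(tp, tn, fp, fn)

print(f"precision={precision:.8f} recall={recall:.8f} accuracy={accuracy:.8f}")
print(f"fscore={fscore:.8f} jcc={jcc:.8f} mcc={fold_mcc:.8f}")
```

Accuracy and F-score are 0.73 for this fold while MCC is only 0.45. MCC is a correlation on a -1 to 1 scale, where chance-level prediction scores 0, so it reads lower than accuracy for the same fold.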
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01519489 0.00943708 0.0094316 0.01037145 0.00944448 0.00915623
|
|
0.00911713 0.00927973 0.00896764 0.0090611 ]
|
|
|
|
mean value: 0.009946131706237793
|
|
|
|
key: score_time
|
|
value: [0.01964593 0.00931048 0.00915456 0.01004958 0.00895691 0.00901747
|
|
0.00918651 0.00874329 0.00890994 0.0088954 ]
|
|
|
|
mean value: 0.010187005996704102
|
|
|
|
key: test_mcc
|
|
value: [0.65909298 1. 0.2773501 0.2773501 0.68313005 0.32539569
|
|
0.48795004 0.46225016 0.29277002 0.09245003]
|
|
|
|
mean value: 0.4557739170966826
|
|
|
|
key: train_mcc
|
|
value: [0.59704686 0.6100504 0.61674214 0.64835272 0.64205788 0.57487842
|
|
0.67095904 0.63930569 0.59145083 0.60302269]
|
|
|
|
mean value: 0.6193866676982089
|
|
|
|
key: test_accuracy
|
|
value: [0.82608696 1. 0.63636364 0.63636364 0.81818182 0.63636364
|
|
0.72727273 0.72727273 0.63636364 0.54545455]
|
|
|
|
mean value: 0.7189723320158102
|
|
|
|
key: train_accuracy
|
|
value: [0.79396985 0.79899497 0.79 0.82 0.82 0.785
|
|
0.83 0.815 0.795 0.8 ]
|
|
|
|
mean value: 0.8047964824120603
|
|
|
|
key: test_fscore
|
|
value: [0.8 1. 0.6 0.6 0.77777778 0.5
|
|
0.66666667 0.7 0.55555556 0.58333333]
|
|
|
|
mean value: 0.6783333333333333
|
|
|
|
key: train_fscore
|
|
value: [0.77595628 0.7752809 0.74698795 0.80434783 0.8125 0.77005348
|
|
0.81318681 0.79781421 0.78756477 0.78947368]
|
|
|
|
mean value: 0.7873165908746415
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.66666667 0.66666667 1. 0.8
|
|
0.85714286 0.77777778 0.71428571 0.53846154]
|
|
|
|
mean value: 0.790989010989011
|
|
|
|
key: train_precision
|
|
value: [0.85542169 0.87341772 0.93939394 0.88095238 0.84782609 0.82758621
|
|
0.90243902 0.87951807 0.8172043 0.83333333]
|
|
|
|
mean value: 0.8657092753553371
|
|
|
|
key: test_recall
|
|
value: [0.72727273 1. 0.54545455 0.54545455 0.63636364 0.36363636
|
|
0.54545455 0.63636364 0.45454545 0.63636364]
|
|
|
|
mean value: 0.6090909090909091
|
|
|
|
key: train_recall
|
|
value: [0.71 0.6969697 0.62 0.74 0.78 0.72 0.74
|
|
0.73 0.76 0.75 ]
|
|
|
|
mean value: 0.7246969696969697
|
|
|
|
key: test_roc_auc
|
|
value: [0.8219697 1. 0.63636364 0.63636364 0.81818182 0.63636364
|
|
0.72727273 0.72727273 0.63636364 0.54545455]
|
|
|
|
mean value: 0.718560606060606
|
|
|
|
key: train_roc_auc
|
|
value: [0.79439394 0.79848485 0.79 0.82 0.82 0.785
|
|
0.83 0.815 0.795 0.8 ]
|
|
|
|
mean value: 0.8047878787878788
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 1. 0.42857143 0.42857143 0.63636364 0.33333333
|
|
0.5 0.53846154 0.38461538 0.41176471]
|
|
|
|
mean value: 0.5328348122465769
|
|
|
|
key: train_jcc
|
|
value: [0.63392857 0.63302752 0.59615385 0.67272727 0.68421053 0.62608696
|
|
0.68518519 0.66363636 0.64957265 0.65217391]
|
|
|
|
mean value: 0.6496702807520676
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
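Each "mean value" line is the arithmetic mean of the ten fold scores printed just above it. The sketch below recomputes the Gaussian NB test_mcc and train_mcc means from the logged values (copied at the printed 8-decimal precision, so the last digits can differ slightly) and reports the train/test gap and the fold-to-fold range. The variable names are illustrative only.

```python
from statistics import fmean

# Fold scores copied from the Gaussian NB block above.
test_mcc = [0.65909298, 1.0, 0.2773501, 0.2773501, 0.68313005,
            0.32539569, 0.48795004, 0.46225016, 0.29277002, 0.09245003]
train_mcc = [0.59704686, 0.6100504, 0.61674214, 0.64835272, 0.64205788,
             0.57487842, 0.67095904, 0.63930569, 0.59145083, 0.60302269]

mean_test = fmean(test_mcc)
mean_train = fmean(train_mcc)

# Gap between training and held-out performance, and spread across folds.
gap = mean_train - mean_test
test_range = max(test_mcc) - min(test_mcc)

print(f"mean test_mcc:  {mean_test:.6f}")
print(f"mean train_mcc: {mean_train:.6f}")
print(f"train - test gap: {gap:.6f}")
print(f"test_mcc range across folds: {test_range:.6f}")
```

The test folds range from 0.09 to 1.0. With roughly 22 samples per fold (an inference from the accuracy fractions, not a logged figure), a single fold score is noisy, so the mean is the number to compare across models.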
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00917482 0.00915647 0.00912809 0.00903654 0.0090971 0.00903273
|
|
0.00915027 0.00949216 0.00917816 0.00924921]
|
|
|
|
mean value: 0.009169554710388184
|
|
|
|
key: score_time
|
|
value: [0.00869656 0.00861049 0.00864029 0.0086565 0.00863481 0.00874043
|
|
0.00908256 0.00869799 0.00867963 0.00878263]
|
|
|
|
mean value: 0.008722186088562012
|
|
|
|
key: test_mcc
|
|
value: [ 0.39393939 1. 0.46225016 0.18898224 0.46225016 0.54772256
|
|
0.2773501 0.36514837 -0.09245003 0.18257419]
|
|
|
|
mean value: 0.3787767137904798
|
|
|
|
key: train_mcc
|
|
value: [0.61070966 0.57109279 0.54043252 0.64205788 0.60048058 0.58292193
|
|
0.60108292 0.59360222 0.67573429 0.59675165]
|
|
|
|
mean value: 0.601486644761032
|
|
|
|
key: test_accuracy
|
|
value: [0.69565217 1. 0.72727273 0.59090909 0.72727273 0.77272727
|
|
0.63636364 0.68181818 0.45454545 0.59090909]
|
|
|
|
mean value: 0.6877470355731226
|
|
|
|
key: train_accuracy
|
|
value: [0.8040201 0.7839196 0.77 0.82 0.8 0.79 0.8
|
|
0.795 0.835 0.795 ]
|
|
|
|
mean value: 0.7992939698492463
|
|
|
|
key: test_fscore
|
|
value: [0.69565217 1. 0.75 0.52631579 0.7 0.76190476
|
|
0.6 0.69565217 0.4 0.57142857]
|
|
|
|
mean value: 0.6700953470633104
|
|
|
|
key: train_fscore
|
|
value: [0.79581152 0.77005348 0.76530612 0.8125 0.79591837 0.77894737
|
|
0.79381443 0.78306878 0.82352941 0.77837838]
|
|
|
|
mean value: 0.7897327858678965
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.69230769 0.625 0.77777778 0.8
|
|
0.66666667 0.66666667 0.44444444 0.6 ]
|
|
|
|
mean value: 0.6939529914529914
|
|
|
|
key: train_precision
|
|
value: [0.83516484 0.81818182 0.78125 0.84782609 0.8125 0.82222222
|
|
0.81914894 0.83146067 0.88505747 0.84705882]
|
|
|
|
mean value: 0.8299870867646693
|
|
|
|
key: test_recall
|
|
value: [0.72727273 1. 0.81818182 0.45454545 0.63636364 0.72727273
|
|
0.54545455 0.72727273 0.36363636 0.54545455]
|
|
|
|
mean value: 0.6545454545454545
|
|
|
|
key: train_recall
|
|
value: [0.76 0.72727273 0.75 0.78 0.78 0.74
|
|
0.77 0.74 0.77 0.72 ]
|
|
|
|
mean value: 0.7537272727272727
|
|
|
|
key: test_roc_auc
|
|
value: [0.6969697 1. 0.72727273 0.59090909 0.72727273 0.77272727
|
|
0.63636364 0.68181818 0.45454545 0.59090909]
|
|
|
|
mean value: 0.6878787878787879
|
|
|
|
key: train_roc_auc
|
|
value: [0.80424242 0.78363636 0.77 0.82 0.8 0.79
|
|
0.8 0.795 0.835 0.795 ]
|
|
|
|
mean value: 0.7992878787878788
|
|
|
|
key: test_jcc
|
|
value: [0.53333333 1. 0.6 0.35714286 0.53846154 0.61538462
|
|
0.42857143 0.53333333 0.25 0.4 ]
|
|
|
|
mean value: 0.5256227106227106
|
|
|
|
key: train_jcc
|
|
value: [0.66086957 0.62608696 0.61983471 0.68421053 0.66101695 0.63793103
|
|
0.65811966 0.64347826 0.7 0.63716814]
|
|
|
|
mean value: 0.6528715803016166
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00928378 0.00874376 0.00974417 0.00925326 0.00961709 0.00862384
|
|
0.00910091 0.00888395 0.00974298 0.00866914]
|
|
|
|
mean value: 0.009166288375854491
|
|
|
|
key: score_time
|
|
value: [0.01568413 0.01454878 0.01571488 0.01478529 0.01504016 0.01515985
|
|
0.01496387 0.0168469 0.01541829 0.01537824]
|
|
|
|
mean value: 0.015354037284851074
|
|
|
|
key: test_mcc
|
|
value: [0.17236256 0.83971912 0.46225016 0.36514837 0.2773501 0.18898224
|
|
0.46225016 0.36514837 0.64715023 0.09090909]
|
|
|
|
mean value: 0.3871270410929328
|
|
|
|
key: train_mcc
|
|
value: [0.54790792 0.45827063 0.5100255 0.58011603 0.55024767 0.61076393
|
|
0.56 0.53215963 0.51022966 0.56101073]
|
|
|
|
mean value: 0.5420731707552674
|
|
|
|
key: test_accuracy
|
|
value: [0.56521739 0.91304348 0.72727273 0.68181818 0.63636364 0.59090909
|
|
0.72727273 0.68181818 0.81818182 0.54545455]
|
|
|
|
mean value: 0.6887351778656127
|
|
|
|
key: train_accuracy
|
|
value: [0.77386935 0.72864322 0.755 0.79 0.775 0.805
|
|
0.78 0.765 0.755 0.78 ]
|
|
|
|
mean value: 0.7707512562814071
|
|
|
|
key: test_fscore
|
|
value: [0.64285714 0.90909091 0.75 0.69565217 0.6 0.64
|
|
0.7 0.66666667 0.8 0.54545455]
|
|
|
|
mean value: 0.6949721437982308
|
|
|
|
key: train_fscore
|
|
value: [0.77832512 0.73529412 0.75376884 0.79207921 0.77832512 0.8097561
|
|
0.78 0.77511962 0.75862069 0.78640777]
|
|
|
|
mean value: 0.7747696587525694
|
|
|
|
key: test_precision
|
|
value: [0.52941176 1. 0.69230769 0.66666667 0.66666667 0.57142857
|
|
0.77777778 0.7 0.88888889 0.54545455]
|
|
|
|
mean value: 0.7038602573896692
|
|
|
|
key: train_precision
|
|
value: [0.76699029 0.71428571 0.75757576 0.78431373 0.76699029 0.79047619
|
|
0.78 0.74311927 0.74757282 0.76415094]
|
|
|
|
mean value: 0.7615474995337383
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.83333333 0.81818182 0.72727273 0.54545455 0.72727273
|
|
0.63636364 0.63636364 0.72727273 0.54545455]
|
|
|
|
mean value: 0.7015151515151515
|
|
|
|
key: train_recall
|
|
value: [0.79 0.75757576 0.75 0.8 0.79 0.83
|
|
0.78 0.81 0.77 0.81 ]
|
|
|
|
mean value: 0.7887575757575758
|
|
|
|
key: test_roc_auc
|
|
value: [0.57575758 0.91666667 0.72727273 0.68181818 0.63636364 0.59090909
|
|
0.72727273 0.68181818 0.81818182 0.54545455]
|
|
|
|
mean value: 0.6901515151515152
|
|
|
|
key: train_roc_auc
|
|
value: [0.77378788 0.72878788 0.755 0.79 0.775 0.805
|
|
0.78 0.765 0.755 0.78 ]
|
|
|
|
mean value: 0.7707575757575758
|
|
|
|
key: test_jcc
|
|
value: [0.47368421 0.83333333 0.6 0.53333333 0.42857143 0.47058824
|
|
0.53846154 0.5 0.66666667 0.375 ]
|
|
|
|
mean value: 0.5419638746186733
|
|
|
|
key: train_jcc
|
|
value: [0.63709677 0.58139535 0.60483871 0.6557377 0.63709677 0.68032787
|
|
0.63934426 0.6328125 0.61111111 0.648 ]
|
|
|
|
mean value: 0.632776105407841
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01423049 0.01355338 0.0119102 0.01166868 0.01288462 0.01357532
|
|
0.01195931 0.01357794 0.01178098 0.01234269]
|
|
|
|
mean value: 0.012748360633850098
|
|
|
|
key: score_time
|
|
value: [0.01121068 0.00992894 0.00999022 0.00975442 0.01017022 0.01056528
|
|
0.00981355 0.01052999 0.00942135 0.00952411]
|
|
|
|
mean value: 0.010090875625610351
|
|
|
|
key: test_mcc
|
|
value: [0.39393939 0.91666667 0.64715023 0.45454545 0.54772256 0.36514837
|
|
0.56694671 0.56694671 0.36514837 0.27272727]
|
|
|
|
mean value: 0.5096941736681291
|
|
|
|
key: train_mcc
|
|
value: [0.79917164 0.72948996 0.77313757 0.76015205 0.7500375 0.76244374
|
|
0.74133561 0.77034673 0.75093926 0.76015205]
|
|
|
|
mean value: 0.7597206109096323
|
|
|
|
key: test_accuracy
|
|
value: [0.69565217 0.95652174 0.81818182 0.72727273 0.77272727 0.68181818
|
|
0.77272727 0.77272727 0.68181818 0.63636364]
|
|
|
|
mean value: 0.7515810276679842
|
|
|
|
key: train_accuracy
|
|
value: [0.89949749 0.86432161 0.885 0.88 0.875 0.88
|
|
0.87 0.885 0.875 0.88 ]
|
|
|
|
mean value: 0.8793819095477386
|
|
|
|
key: test_fscore
|
|
value: [0.69565217 0.95652174 0.83333333 0.72727273 0.7826087 0.69565217
|
|
0.73684211 0.8 0.69565217 0.63636364]
|
|
|
|
mean value: 0.7559898758754594
|
|
|
|
key: train_fscore
|
|
value: [0.8989899 0.86010363 0.87958115 0.88118812 0.87437186 0.88461538
|
|
0.86597938 0.88324873 0.87804878 0.87878788]
|
|
|
|
mean value: 0.8784914812172563
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.76923077 0.72727273 0.75 0.66666667
|
|
0.875 0.71428571 0.66666667 0.63636364]
|
|
|
|
mean value: 0.7472152847152848
|
|
|
|
key: train_precision
|
|
value: [0.90816327 0.88297872 0.92307692 0.87254902 0.87878788 0.85185185
|
|
0.89361702 0.89690722 0.85714286 0.8877551 ]
|
|
|
|
mean value: 0.8852829858989989
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 0.90909091 0.72727273 0.81818182 0.72727273
|
|
0.63636364 0.90909091 0.72727273 0.63636364]
|
|
|
|
mean value: 0.7734848484848484
|
|
|
|
key: train_recall
|
|
value: [0.89 0.83838384 0.84 0.89 0.87 0.92
|
|
0.84 0.87 0.9 0.87 ]
|
|
|
|
mean value: 0.8728383838383839
|
|
|
|
key: test_roc_auc
|
|
value: [0.6969697 0.95833333 0.81818182 0.72727273 0.77272727 0.68181818
|
|
0.77272727 0.77272727 0.68181818 0.63636364]
|
|
|
|
mean value: 0.7518939393939393
|
|
|
|
key: train_roc_auc
|
|
value: [0.89954545 0.86419192 0.885 0.88 0.875 0.88
|
|
0.87 0.885 0.875 0.88 ]
|
|
|
|
mean value: 0.8793737373737374
|
|
|
|
key: test_jcc
|
|
value: [0.53333333 0.91666667 0.71428571 0.57142857 0.64285714 0.53333333
|
|
0.58333333 0.66666667 0.53333333 0.46666667]
|
|
|
|
mean value: 0.6161904761904762
|
|
|
|
key: train_jcc
|
|
value: [0.81651376 0.75454545 0.78504673 0.78761062 0.77678571 0.79310345
|
|
0.76363636 0.79090909 0.7826087 0.78378378]
|
|
|
|
mean value: 0.7834543660997322
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.97043252 0.98709679 0.94861197 1.01687264 1.09533525 0.92243099
|
|
0.88711429 0.95853901 0.8539257 0.86596727]
|
|
|
|
mean value: 0.950632643699646
|
|
|
|
key: score_time
|
|
value: [0.01569128 0.01483226 0.01499128 0.01794028 0.01487207 0.01535821
|
|
0.01524711 0.01560092 0.01368737 0.01333737]
|
|
|
|
mean value: 0.015155816078186035
|
|
|
|
key: test_mcc
|
|
value: [0.48856385 0.82575758 0.54772256 0.36514837 0.63636364 0.45454545
|
|
0.73029674 0.64715023 0.46225016 0.36514837]
|
|
|
|
mean value: 0.5522946956544701
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.91304348 0.77272727 0.68181818 0.81818182 0.72727273
|
|
0.86363636 0.81818182 0.72727273 0.68181818]
|
|
|
|
mean value: 0.7743083003952569
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.91666667 0.7826087 0.66666667 0.81818182 0.72727273
|
|
0.85714286 0.83333333 0.75 0.69565217]
|
|
|
|
mean value: 0.7797524938829287
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.69230769 0.91666667 0.75 0.7 0.81818182 0.72727273
|
|
0.9 0.76923077 0.69230769 0.66666667]
|
|
|
|
mean value: 0.7632634032634033
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.91666667 0.81818182 0.63636364 0.81818182 0.72727273
|
|
0.81818182 0.90909091 0.81818182 0.72727273]
|
|
|
|
mean value: 0.8007575757575758
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.74242424 0.91287879 0.77272727 0.68181818 0.81818182 0.72727273
|
|
0.86363636 0.81818182 0.72727273 0.68181818]
|
|
|
|
mean value: 0.7746212121212122
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.84615385 0.64285714 0.5 0.69230769 0.57142857
|
|
0.75 0.71428571 0.6 0.53333333]
|
|
|
|
mean value: 0.64503663003663
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.49
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02525091 0.02160215 0.02029204 0.01865292 0.01829171 0.01900959
|
|
0.017452 0.01893425 0.01651978 0.0175488 ]
|
|
|
|
mean value: 0.019355416297912598
|
|
|
|
key: score_time
|
|
value: [0.01383924 0.01080704 0.01054907 0.01040816 0.00994539 0.00939131
|
|
0.00986552 0.0096252 0.00948071 0.0104351 ]
|
|
|
|
mean value: 0.010434675216674804
|
|
|
|
key: test_mcc
|
|
value: [0.56818182 0.83971912 0.54772256 0.46225016 0.54772256 0.18898224
|
|
0.81818182 0.63636364 0.63636364 0.64715023]
|
|
|
|
mean value: 0.5892637775815944
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.7826087 0.91304348 0.77272727 0.72727273 0.77272727 0.59090909
|
|
0.90909091 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.792292490118577
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.90909091 0.7826087 0.75 0.7826087 0.52631579
|
|
0.90909091 0.81818182 0.81818182 0.8 ]
|
|
|
|
mean value: 0.787868733097566
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 1. 0.75 0.69230769 0.75 0.625
|
|
0.90909091 0.81818182 0.81818182 0.88888889]
|
|
|
|
mean value: 0.8001651126651127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.83333333 0.81818182 0.81818182 0.81818182 0.45454545
|
|
0.90909091 0.81818182 0.81818182 0.72727273]
|
|
|
|
mean value: 0.7833333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78409091 0.91666667 0.77272727 0.72727273 0.77272727 0.59090909
|
|
0.90909091 0.81818182 0.81818182 0.81818182]
|
|
|
|
mean value: 0.7928030303030303
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.83333333 0.64285714 0.6 0.64285714 0.35714286
|
|
0.83333333 0.69230769 0.69230769 0.66666667]
|
|
|
|
mean value: 0.6603663003663004
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11458826 0.10680294 0.10583305 0.1013813 0.09803462 0.09663272
|
|
0.09713149 0.10008979 0.09998441 0.10197449]
|
|
|
|
mean value: 0.10224530696868897
|
|
|
|
key: score_time
|
|
value: [0.01931691 0.01970983 0.0188899 0.01814747 0.01751494 0.01744604
|
|
0.01884842 0.01791477 0.01759744 0.01804161]
|
|
|
|
mean value: 0.01834273338317871
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.91666667 0.73029674 0.54772256 0.81818182 0.45454545
|
|
0.75592895 0.75592895 0.64715023 0.45454545]
|
|
|
|
mean value: 0.6558239543023852
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.95652174 0.86363636 0.77272727 0.90909091 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.72727273]
|
|
|
|
mean value: 0.8241106719367589
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.95652174 0.86956522 0.76190476 0.90909091 0.72727273
|
|
0.88 0.88 0.83333333 0.72727273]
|
|
|
|
mean value: 0.8272234142668925
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 0.83333333 0.8 0.90909091 0.72727273
|
|
0.78571429 0.78571429 0.76923077 0.72727273]
|
|
|
|
mean value: 0.8064901764901765
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 0.90909091 0.72727273 0.90909091 0.72727273
|
|
1. 1. 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8553030303030303
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.95833333 0.86363636 0.77272727 0.90909091 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.72727273]
|
|
|
|
mean value: 0.8242424242424242
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.91666667 0.76923077 0.61538462 0.83333333 0.57142857
|
|
0.78571429 0.78571429 0.71428571 0.57142857]
|
|
|
|
mean value: 0.7134615384615385
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01024938 0.00933266 0.01019812 0.01005816 0.00927067 0.01000428
|
|
0.01039481 0.01046586 0.00951385 0.01014614]
|
|
|
|
mean value: 0.009963393211364746
|
|
|
|
key: score_time
|
|
value: [0.00944662 0.00883985 0.00952673 0.00927114 0.00905871 0.00973105
|
|
0.0096364 0.00956321 0.009341 0.00951195]
|
|
|
|
mean value: 0.009392666816711425
|
|
|
|
key: test_mcc
|
|
value: [ 0.3030303 0.66414149 0.36514837 0. 0.54772256 0.2773501
|
|
0.37796447 0.36514837 0.36514837 -0.09090909]
|
|
|
|
mean value: 0.31747449437593517
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.65217391 0.82608696 0.68181818 0.5 0.77272727 0.63636364
|
|
0.68181818 0.68181818 0.68181818 0.45454545]
|
|
|
|
mean value: 0.6569169960474308
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.63636364 0.81818182 0.66666667 0.47619048 0.7826087 0.66666667
|
|
0.72 0.66666667 0.66666667 0.45454545]
|
|
|
|
mean value: 0.6554556747600225
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.9 0.7 0.5 0.75 0.61538462
|
|
0.64285714 0.7 0.7 0.45454545]
|
|
|
|
mean value: 0.6599150849150849
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.63636364 0.75 0.63636364 0.45454545 0.81818182 0.72727273
|
|
0.81818182 0.63636364 0.63636364 0.45454545]
|
|
|
|
mean value: 0.6568181818181819
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.65151515 0.82954545 0.68181818 0.5 0.77272727 0.63636364
|
|
0.68181818 0.68181818 0.68181818 0.45454545]
|
|
|
|
mean value: 0.6571969696969697
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.46666667 0.69230769 0.5 0.3125 0.64285714 0.5
|
|
0.5625 0.5 0.5 0.29411765]
|
|
|
|
mean value: 0.49709491488903257
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.62
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.39370704 1.41524959 1.35751224 1.38201571 1.38745046 1.37072587
|
|
1.37028241 1.33769202 1.36783028 1.3226316 ]
|
|
|
|
mean value: 1.370509719848633
|
|
|
|
key: score_time
|
|
value: [0.09871769 0.09867597 0.09723949 0.09882069 0.09869909 0.15835905
|
|
0.09611416 0.09787035 0.09561229 0.09823298]
|
|
|
|
mean value: 0.10383417606353759
|
|
|
|
key: test_mcc
|
|
value: [0.56818182 0.91666667 0.91287093 0.54772256 0.64715023 0.46225016
|
|
0.83205029 0.91287093 0.73029674 0.45454545]
|
|
|
|
mean value: 0.6984605785378183
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.7826087 0.95652174 0.95454545 0.77272727 0.81818182 0.72727273
|
|
0.90909091 0.95454545 0.86363636 0.72727273]
|
|
|
|
mean value: 0.8466403162055336
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.95652174 0.95238095 0.7826087 0.83333333 0.7
|
|
0.9 0.95652174 0.86956522 0.72727273]
|
|
|
|
mean value: 0.8460813099943535
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 1. 1. 0.75 0.76923077 0.77777778
|
|
1. 0.91666667 0.83333333 0.72727273]
|
|
|
|
mean value: 0.8524281274281275
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.91666667 0.90909091 0.81818182 0.90909091 0.63636364
|
|
0.81818182 1. 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8462121212121212
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78409091 0.95833333 0.95454545 0.77272727 0.81818182 0.72727273
|
|
0.90909091 0.95454545 0.86363636 0.72727273]
|
|
|
|
mean value: 0.8469696969696969
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.91666667 0.90909091 0.64285714 0.71428571 0.53846154
|
|
0.81818182 0.91666667 0.76923077 0.57142857]
|
|
|
|
mean value: 0.7439726939726939
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87208843 0.89681029 0.89619946 0.83190942 0.91177535 0.97973609
|
|
0.93089819 0.90625811 0.86904597 0.93702197]
|
|
|
|
mean value: 0.9031743288040162
|
|
|
|
key: score_time
|
|
value: [0.17905688 0.20153785 0.17578483 0.1625216 0.16287255 0.22656584
|
|
0.14112735 0.2074976 0.22874093 0.134022 ]
|
|
|
|
mean value: 0.18197274208068848
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.91666667 1. 0.63636364 0.81818182 0.64715023
|
|
0.83205029 0.91287093 0.73029674 0.2773501 ]
|
|
|
|
mean value: 0.7248203142380238
|
|
|
|
key: train_mcc
|
|
value: [0.96989899 0.95998792 0.95042779 0.95042779 0.94018806 0.9700485
|
|
0.95042779 0.95042779 0.96076892 0.94018806]
|
|
|
|
mean value: 0.9542791603538276
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.95652174 1. 0.81818182 0.90909091 0.81818182
|
|
0.90909091 0.95454545 0.86363636 0.63636364]
|
|
|
|
mean value: 0.8604743083003953
|
|
|
|
key: train_accuracy
|
|
value: [0.98492462 0.9798995 0.975 0.975 0.97 0.985
|
|
0.975 0.975 0.98 0.97 ]
|
|
|
|
mean value: 0.9769824120603015
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.95652174 1. 0.81818182 0.90909091 0.8
|
|
0.9 0.95652174 0.86956522 0.6 ]
|
|
|
|
mean value: 0.8537154150197629
|
|
|
|
key: train_fscore
|
|
value: [0.98492462 0.97959184 0.97461929 0.97461929 0.96969697 0.98492462
|
|
0.97461929 0.97461929 0.97959184 0.96969697]
|
|
|
|
mean value: 0.9766904016454888
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 1. 0.81818182 0.90909091 0.88888889
|
|
1. 0.91666667 0.83333333 0.66666667]
|
|
|
|
mean value: 0.876010101010101
|
|
|
|
key: train_precision
|
|
value: [0.98989899 0.98969072 0.98969072 0.98969072 0.97959184 0.98989899
|
|
0.98969072 0.98969072 1. 0.97959184]
|
|
|
|
mean value: 0.9887435261514791
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 1. 0.81818182 0.90909091 0.72727273
|
|
0.81818182 1. 0.90909091 0.54545455]
|
|
|
|
mean value: 0.8371212121212122
|
|
|
|
key: train_recall
|
|
value: [0.98 0.96969697 0.96 0.96 0.96 0.98
|
|
0.96 0.96 0.96 0.96 ]
|
|
|
|
mean value: 0.9649696969696969
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.95833333 1. 0.81818182 0.90909091 0.81818182
|
|
0.90909091 0.95454545 0.86363636 0.63636364]
|
|
|
|
mean value: 0.8606060606060606
|
|
|
|
key: train_roc_auc
|
|
value: [0.98494949 0.97984848 0.975 0.975 0.97 0.985
|
|
0.975 0.975 0.98 0.97 ]
|
|
|
|
mean value: 0.976979797979798
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.91666667 1. 0.69230769 0.83333333 0.66666667
|
|
0.81818182 0.91666667 0.76923077 0.42857143]
|
|
|
|
mean value: 0.7613053613053613
|
|
|
|
key: train_jcc
|
|
value: [0.97029703 0.96 0.95049505 0.95049505 0.94117647 0.97029703
|
|
0.95049505 0.95049505 0.96 0.94117647]
|
|
|
|
mean value: 0.9544927198602213
|
|
|
|
MCC on Blind test: 0.55
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01043224 0.00909019 0.01290274 0.01403213 0.01533985 0.01524782
|
|
0.01515961 0.01238012 0.01442981 0.01607943]
|
|
|
|
mean value: 0.013509392738342285
|
|
|
|
key: score_time
|
|
value: [0.00940108 0.01727343 0.01171255 0.01217628 0.01322937 0.01409483
|
|
0.01232147 0.01480937 0.01329613 0.01417756]
|
|
|
|
mean value: 0.01324920654296875
|
|
|
|
key: test_mcc
|
|
value: [ 0.39393939 1. 0.46225016 0.18898224 0.46225016 0.54772256
|
|
0.2773501 0.36514837 -0.09245003 0.18257419]
|
|
|
|
mean value: 0.3787767137904798
|
|
|
|
key: train_mcc
|
|
value: [0.61070966 0.57109279 0.54043252 0.64205788 0.60048058 0.58292193
|
|
0.60108292 0.59360222 0.67573429 0.59675165]
|
|
|
|
mean value: 0.601486644761032
|
|
|
|
key: test_accuracy
|
|
value: [0.69565217 1. 0.72727273 0.59090909 0.72727273 0.77272727
|
|
0.63636364 0.68181818 0.45454545 0.59090909]
|
|
|
|
mean value: 0.6877470355731226
|
|
|
|
key: train_accuracy
|
|
value: [0.8040201 0.7839196 0.77 0.82 0.8 0.79 0.8
|
|
0.795 0.835 0.795 ]
|
|
|
|
mean value: 0.7992939698492463
|
|
|
|
key: test_fscore
|
|
value: [0.69565217 1. 0.75 0.52631579 0.7 0.76190476
|
|
0.6 0.69565217 0.4 0.57142857]
|
|
|
|
mean value: 0.6700953470633104
|
|
|
|
key: train_fscore
|
|
value: [0.79581152 0.77005348 0.76530612 0.8125 0.79591837 0.77894737
|
|
0.79381443 0.78306878 0.82352941 0.77837838]
|
|
|
|
mean value: 0.7897327858678965
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.69230769 0.625 0.77777778 0.8
|
|
0.66666667 0.66666667 0.44444444 0.6 ]
|
|
|
|
mean value: 0.6939529914529914
|
|
|
|
key: train_precision
|
|
value: [0.83516484 0.81818182 0.78125 0.84782609 0.8125 0.82222222
|
|
0.81914894 0.83146067 0.88505747 0.84705882]
|
|
|
|
mean value: 0.8299870867646693
|
|
|
|
key: test_recall
|
|
value: [0.72727273 1. 0.81818182 0.45454545 0.63636364 0.72727273
|
|
0.54545455 0.72727273 0.36363636 0.54545455]
|
|
|
|
mean value: 0.6545454545454545
|
|
|
|
key: train_recall
|
|
value: [0.76 0.72727273 0.75 0.78 0.78 0.74
|
|
0.77 0.74 0.77 0.72 ]
|
|
|
|
mean value: 0.7537272727272727
|
|
|
|
key: test_roc_auc
|
|
value: [0.6969697 1. 0.72727273 0.59090909 0.72727273 0.77272727
|
|
0.63636364 0.68181818 0.45454545 0.59090909]
|
|
|
|
mean value: 0.6878787878787879
|
|
|
|
key: train_roc_auc
|
|
value: [0.80424242 0.78363636 0.77 0.82 0.8 0.79
|
|
0.8 0.795 0.835 0.795 ]
|
|
|
|
mean value: 0.7992878787878788
|
|
|
|
key: test_jcc
|
|
value: [0.53333333 1. 0.6 0.35714286 0.53846154 0.61538462
|
|
0.42857143 0.53333333 0.25 0.4 ]
|
|
|
|
mean value: 0.5256227106227106
|
|
|
|
key: train_jcc
|
|
value: [0.66086957 0.62608696 0.61983471 0.68421053 0.66101695 0.63793103
|
|
0.65811966 0.64347826 0.7 0.63716814]
|
|
|
|
mean value: 0.6528715803016166
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.70163393 0.10867262 0.21027184 1.58362126 0.15635705 0.38029337
|
|
1.38273144 0.10912514 0.25329494 1.5216136 ]
|
|
|
|
mean value: 0.6407615184783936
|
|
|
|
key: score_time
|
|
value: [0.01133823 0.01368117 0.01305103 0.01234221 0.01410246 0.01277947
|
|
0.01145935 0.0133307 0.01315022 0.01216602]
|
|
|
|
mean value: 0.012740087509155274
|
|
|
|
key: test_mcc
|
|
value: [0.66414149 0.83971912 1. 0.83205029 0.73029674 0.73029674
|
|
0.83205029 0.73029674 0.73029674 0.83205029]
|
|
|
|
mean value: 0.7921198467134848
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.82608696 0.91304348 1. 0.90909091 0.86363636 0.86363636
|
|
0.90909091 0.86363636 0.86363636 0.90909091]
|
|
|
|
mean value: 0.892094861660079
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.90909091 1. 0.9 0.85714286 0.85714286
|
|
0.9 0.86956522 0.86956522 0.9 ]
|
|
|
|
mean value: 0.8895840391492565
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76923077 1. 1. 1. 0.9 0.9
|
|
1. 0.83333333 0.83333333 1. ]
|
|
|
|
mean value: 0.9235897435897436
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.83333333 1. 0.81818182 0.81818182 0.81818182
|
|
0.81818182 0.90909091 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8651515151515152
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.91666667 1. 0.90909091 0.86363636 0.86363636
|
|
0.90909091 0.86363636 0.86363636 0.90909091]
|
|
|
|
mean value: 0.8928030303030303
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.83333333 1. 0.81818182 0.75 0.75
|
|
0.81818182 0.76923077 0.76923077 0.81818182]
|
|
|
|
mean value: 0.8040626040626041
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.05875063 0.02724981 0.03360963 0.03436136 0.06647086 0.03612995
|
|
0.0561471 0.06006169 0.05103612 0.07747197]
|
|
|
|
mean value: 0.05012891292572021
|
|
|
|
key: score_time
|
|
value: [0.04140472 0.01212978 0.01214314 0.02116013 0.01222682 0.01217699
|
|
0.02134228 0.0206449 0.02383661 0.02042603]
|
|
|
|
mean value: 0.019749140739440917
|
|
|
|
key: test_mcc
|
|
value: [0.74047959 0.58930667 0.63636364 0.2773501 0.54772256 0.36514837
|
|
0.68313005 0.36514837 0.56694671 0.36514837]
|
|
|
|
mean value: 0.5136744424225118
|
|
|
|
key: train_mcc
|
|
value: [0.96989899 0.950172 0.9900495 0.9900495 0.98019606 0.9900495
|
|
1. 0.97043679 1. 0.98019606]
|
|
|
|
mean value: 0.9821048409556434
|
|
|
|
key: test_accuracy
|
|
value: [0.86956522 0.7826087 0.81818182 0.63636364 0.77272727 0.68181818
|
|
0.81818182 0.68181818 0.77272727 0.68181818]
|
|
|
|
mean value: 0.7515810276679842
|
|
|
|
key: train_accuracy
|
|
value: [0.98492462 0.97487437 0.995 0.995 0.99 0.995
|
|
1. 0.985 1. 0.99 ]
|
|
|
|
mean value: 0.9909798994974874
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.76190476 0.81818182 0.6 0.76190476 0.66666667
|
|
0.77777778 0.69565217 0.8 0.69565217]
|
|
|
|
mean value: 0.7434882991404731
|
|
|
|
key: train_fscore
|
|
value: [0.98492462 0.97435897 0.99502488 0.99497487 0.98989899 0.99502488
|
|
1. 0.98477157 1. 0.98989899]
|
|
|
|
mean value: 0.9908877776492233
|
|
|
|
key: test_precision
|
|
value: [0.9 0.88888889 0.81818182 0.66666667 0.8 0.7
|
|
1. 0.66666667 0.71428571 0.66666667]
|
|
|
|
mean value: 0.7821356421356421
|
|
|
|
key: train_precision
|
|
value: [0.98989899 0.98958333 0.99009901 1. 1. 0.99009901
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9959680343034304
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.66666667 0.81818182 0.54545455 0.72727273 0.63636364
|
|
0.63636364 0.72727273 0.90909091 0.72727273]
|
|
|
|
mean value: 0.7212121212121212
|
|
|
|
key: train_recall
|
|
value: [0.98 0.95959596 1. 0.99 0.98 1.
|
|
1. 0.97 1. 0.98 ]
|
|
|
|
mean value: 0.9859595959595959
|
|
|
|
key: test_roc_auc
|
|
value: [0.86742424 0.78787879 0.81818182 0.63636364 0.77272727 0.68181818
|
|
0.81818182 0.68181818 0.77272727 0.68181818]
|
|
|
|
mean value: 0.7518939393939393
|
|
|
|
key: train_roc_auc
|
|
value: [0.98494949 0.97479798 0.995 0.995 0.99 0.995
|
|
1. 0.985 1. 0.99 ]
|
|
|
|
mean value: 0.9909747474747475
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.61538462 0.69230769 0.42857143 0.61538462 0.5
|
|
0.63636364 0.53333333 0.66666667 0.53333333]
|
|
|
|
mean value: 0.5971345321345322
|
|
|
|
key: train_jcc
|
|
value: [0.97029703 0.95 0.99009901 0.99 0.98 0.99009901
|
|
1. 0.97 1. 0.98 ]
|
|
|
|
mean value: 0.982049504950495
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02285457 0.00959587 0.01455879 0.0161109 0.01507545 0.01554322
|
|
0.01422286 0.01510739 0.01378608 0.01424479]
|
|
|
|
mean value: 0.015109992027282715
|
|
|
|
key: score_time
|
|
value: [0.0105598 0.00911903 0.01251507 0.01496482 0.01408434 0.01389599
|
|
0.01317477 0.01339459 0.01391745 0.01329112]
|
|
|
|
mean value: 0.012891697883605956
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.91605722 0.56694671 0.46225016 0.56694671 0.54772256
|
|
0.56694671 0.54772256 0. 0.27272727]
|
|
|
|
mean value: 0.49245926319015676
|
|
|
|
key: train_mcc
|
|
value: [0.60824753 0.56783514 0.61076393 0.62111902 0.60048058 0.56
|
|
0.61076393 0.60012004 0.59073889 0.55024767]
|
|
|
|
mean value: 0.5920316725564911
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.95652174 0.77272727 0.72727273 0.77272727 0.77272727
|
|
0.77272727 0.77272727 0.5 0.63636364]
|
|
|
|
mean value: 0.742292490118577
|
|
|
|
key: train_accuracy
|
|
value: [0.8040201 0.7839196 0.805 0.81 0.8 0.78 0.805
|
|
0.8 0.795 0.775 ]
|
|
|
|
mean value: 0.7957939698492462
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.96 0.8 0.7 0.73684211 0.76190476
|
|
0.73684211 0.7826087 0.47619048 0.63636364]
|
|
|
|
mean value: 0.7318024507910091
|
|
|
|
key: train_fscore
|
|
value: [0.80788177 0.78172589 0.8 0.81553398 0.79591837 0.78
|
|
0.8 0.7979798 0.8 0.77832512]
|
|
|
|
mean value: 0.7957364930785858
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.92307692 0.71428571 0.77777778 0.875 0.8
|
|
0.875 0.75 0.5 0.63636364]
|
|
|
|
mean value: 0.7578776778776779
|
|
|
|
key: train_precision
|
|
value: [0.7961165 0.78571429 0.82105263 0.79245283 0.8125 0.78
|
|
0.82105263 0.80612245 0.78095238 0.76699029]
|
|
|
|
mean value: 0.7962954005109337
|
|
|
|
key: test_recall
|
|
value: [0.72727273 1. 0.90909091 0.63636364 0.63636364 0.72727273
|
|
0.63636364 0.81818182 0.45454545 0.63636364]
|
|
|
|
mean value: 0.7181818181818181
|
|
|
|
key: train_recall
|
|
value: [0.82 0.77777778 0.78 0.84 0.78 0.78
|
|
0.78 0.79 0.82 0.79 ]
|
|
|
|
mean value: 0.7957777777777778
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.95454545 0.77272727 0.72727273 0.77272727 0.77272727
|
|
0.77272727 0.77272727 0.5 0.63636364]
|
|
|
|
mean value: 0.7420454545454546
|
|
|
|
key: train_roc_auc
|
|
value: [0.80393939 0.78388889 0.805 0.81 0.8 0.78
|
|
0.805 0.8 0.795 0.775 ]
|
|
|
|
mean value: 0.7957828282828283
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.92307692 0.66666667 0.53846154 0.58333333 0.61538462
|
|
0.58333333 0.64285714 0.3125 0.46666667]
|
|
|
|
mean value: 0.5903708791208792
|
|
|
|
key: train_jcc
|
|
value: [0.67768595 0.64166667 0.66666667 0.68852459 0.66101695 0.63934426
|
|
0.66666667 0.66386555 0.66666667 0.63709677]
|
|
|
|
mean value: 0.6609200739103485
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01830077 0.01933861 0.01750112 0.01849151 0.01687765 0.01699734
|
|
0.01950169 0.01714587 0.01839066 0.01669145]
|
|
|
|
mean value: 0.017923665046691895
|
|
|
|
key: score_time
|
|
value: [0.01280808 0.01439786 0.0138402 0.01365709 0.01333284 0.01249266
|
|
0.01363063 0.01358247 0.01361966 0.01381898]
|
|
|
|
mean value: 0.013518047332763673
|
|
|
|
key: test_mcc
|
|
value: [0.65909298 0.91666667 0.61237244 0.2773501 0.68313005 0.37796447
|
|
0.37796447 0.73029674 0.75592895 0.31622777]
|
|
|
|
mean value: 0.5706994635298601
|
|
|
|
key: train_mcc
|
|
value: [0.69567269 0.82317181 0.63910148 0.70471677 0.66245673 0.83070192
|
|
0.67028006 0.77908775 0.83710367 0.4843221 ]
|
|
|
|
mean value: 0.7126614996184624
|
|
|
|
key: test_accuracy
|
|
value: [0.82608696 0.95652174 0.77272727 0.63636364 0.81818182 0.68181818
|
|
0.68181818 0.86363636 0.86363636 0.59090909]
|
|
|
|
mean value: 0.7691699604743083
|
|
|
|
key: train_accuracy
|
|
value: [0.82914573 0.90954774 0.79 0.835 0.805 0.91
|
|
0.81 0.88 0.915 0.69 ]
|
|
|
|
mean value: 0.8373693467336684
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.95652174 0.81481481 0.66666667 0.77777778 0.63157895
|
|
0.72 0.85714286 0.88 0.70967742]
|
|
|
|
mean value: 0.7814180222255811
|
|
|
|
key: train_fscore
|
|
value: [0.79761905 0.90425532 0.82644628 0.85714286 0.75776398 0.90217391
|
|
0.84033613 0.86516854 0.92018779 0.76335878]
|
|
|
|
mean value: 0.8434452638934143
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.6875 0.61538462 1. 0.75
|
|
0.64285714 0.9 0.78571429 0.55 ]
|
|
|
|
mean value: 0.7820344932844933
|
|
|
|
key: train_precision
|
|
value: [0.98529412 0.95505618 0.70422535 0.75572519 1. 0.98809524
|
|
0.72463768 0.98717949 0.86725664 0.61728395]
|
|
|
|
mean value: 0.8584753834594282
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 1. 0.72727273 0.63636364 0.54545455
|
|
0.81818182 0.81818182 1. 1. ]
|
|
|
|
mean value: 0.818939393939394
|
|
|
|
key: train_recall
|
|
value: [0.67 0.85858586 1. 0.99 0.61 0.83
|
|
1. 0.77 0.98 1. ]
|
|
|
|
mean value: 0.8708585858585859
|
|
|
|
key: test_roc_auc
|
|
value: [0.8219697 0.95833333 0.77272727 0.63636364 0.81818182 0.68181818
|
|
0.68181818 0.86363636 0.86363636 0.59090909]
|
|
|
|
mean value: 0.7689393939393939
|
|
|
|
key: train_roc_auc
|
|
value: [0.82994949 0.90929293 0.79 0.835 0.805 0.91
|
|
0.81 0.88 0.915 0.69 ]
|
|
|
|
mean value: 0.8374242424242424
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.91666667 0.6875 0.5 0.63636364 0.46153846
|
|
0.5625 0.75 0.78571429 0.55 ]
|
|
|
|
mean value: 0.6516949716949717
|
|
|
|
key: train_jcc
|
|
value: [0.66336634 0.82524272 0.70422535 0.75 0.61 0.82178218
|
|
0.72463768 0.76237624 0.85217391 0.61728395]
|
|
|
|
mean value: 0.7331088367854708
|
|
|
|
MCC on Blind test: 0.4
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01732206 0.01706791 0.01785231 0.01656723 0.01566243 0.01723099
|
|
0.01307607 0.01557326 0.01516747 0.02194858]
|
|
|
|
mean value: 0.016746830940246583
|
|
|
|
key: score_time
|
|
value: [0.01364207 0.01363683 0.01368928 0.01371098 0.01355767 0.01361108
|
|
0.012398 0.01225924 0.01247382 0.01716232]
|
|
|
|
mean value: 0.013614130020141602
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.82575758 0.75592895 0.20412415 0.68313005 0.45454545
|
|
0.81818182 0.31622777 0.31622777 0.48795004]
|
|
|
|
mean value: 0.5339346286579878
|
|
|
|
key: train_mcc
|
|
value: [0.93007986 0.72825731 0.89040077 0.79676132 0.63800912 0.87354505
|
|
0.7726195 0.43643578 0.35156152 0.88070485]
|
|
|
|
mean value: 0.7298375083788813
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.91304348 0.86363636 0.59090909 0.81818182 0.72727273
|
|
0.90909091 0.59090909 0.59090909 0.72727273]
|
|
|
|
mean value: 0.7470355731225297
|
|
|
|
key: train_accuracy
|
|
value: [0.96482412 0.84924623 0.945 0.895 0.8 0.935
|
|
0.88 0.66 0.61 0.94 ]
|
|
|
|
mean value: 0.8479070351758794
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.91666667 0.88 0.47058824 0.84615385 0.72727273
|
|
0.90909091 0.70967742 0.70967742 0.76923077]
|
|
|
|
mean value: 0.7665630719691441
|
|
|
|
key: train_fscore
|
|
value: [0.96446701 0.86725664 0.94581281 0.88770053 0.82905983 0.93779904
|
|
0.88990826 0.74626866 0.71942446 0.94117647]
|
|
|
|
mean value: 0.8728873701624488
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.91666667 0.78571429 0.66666667 0.73333333 0.72727273
|
|
0.90909091 0.55 0.55 0.66666667]
|
|
|
|
mean value: 0.7232683982683983
|
|
|
|
key: train_precision
|
|
value: [0.97938144 0.77165354 0.93203883 0.95402299 0.7238806 0.89908257
|
|
0.8220339 0.5952381 0.56179775 0.92307692]
|
|
|
|
mean value: 0.8162206645314616
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 1. 0.36363636 1. 0.72727273
|
|
0.90909091 1. 1. 0.90909091]
|
|
|
|
mean value: 0.8553030303030303
|
|
|
|
key: train_recall
|
|
value: [0.95 0.98989899 0.96 0.83 0.97 0.98
|
|
0.97 1. 1. 0.96 ]
|
|
|
|
mean value: 0.960989898989899
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.91287879 0.86363636 0.59090909 0.81818182 0.72727273
|
|
0.90909091 0.59090909 0.59090909 0.72727273]
|
|
|
|
mean value: 0.746969696969697
|
|
|
|
key: train_roc_auc
|
|
value: [0.96489899 0.84994949 0.945 0.895 0.8 0.935
|
|
0.88 0.66 0.61 0.94 ]
|
|
|
|
mean value: 0.8479848484848485
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.84615385 0.78571429 0.30769231 0.73333333 0.57142857
|
|
0.83333333 0.55 0.55 0.625 ]
|
|
|
|
mean value: 0.6374084249084249
|
|
|
|
key: train_jcc
|
|
value: [0.93137255 0.765625 0.89719626 0.79807692 0.7080292 0.88288288
|
|
0.80165289 0.5952381 0.56179775 0.88888889]
|
|
|
|
mean value: 0.7830760443239905
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14788842 0.12276125 0.11564207 0.11369634 0.17145896 0.15707755
|
|
0.17124152 0.11453056 0.11493278 0.1138854 ]
|
|
|
|
mean value: 0.13431148529052733
|
|
|
|
key: score_time
|
|
value: [0.01740265 0.01460624 0.01493835 0.01787996 0.02255607 0.02203274
|
|
0.01534486 0.0149529 0.01473022 0.01477933]
|
|
|
|
mean value: 0.01692233085632324
|
|
|
|
key: test_mcc
|
|
value: [0.82575758 0.83971912 1. 0.64715023 0.81818182 0.46225016
|
|
0.73029674 0.73029674 0.73029674 0.54772256]
|
|
|
|
mean value: 0.7331671696675314
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91304348 0.91304348 1. 0.81818182 0.90909091 0.72727273
|
|
0.86363636 0.86363636 0.86363636 0.77272727]
|
|
|
|
mean value: 0.8644268774703557
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.90909091 1. 0.83333333 0.90909091 0.75
|
|
0.85714286 0.86956522 0.86956522 0.76190476]
|
|
|
|
mean value: 0.8668784114436289
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90909091 1. 1. 0.76923077 0.90909091 0.69230769
|
|
0.9 0.83333333 0.83333333 0.8 ]
|
|
|
|
mean value: 0.8646386946386947
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.83333333 1. 0.90909091 0.90909091 0.81818182
|
|
0.81818182 0.90909091 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8742424242424243
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.91287879 0.91666667 1. 0.81818182 0.90909091 0.72727273
|
|
0.86363636 0.86363636 0.86363636 0.77272727]
|
|
|
|
mean value: 0.8647727272727272
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.83333333 1. 0.71428571 0.83333333 0.6
|
|
0.75 0.76923077 0.76923077 0.61538462]
|
|
|
|
mean value: 0.7718131868131868
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04062152 0.0381217 0.04360723 0.03369117 0.057724 0.04869604
|
|
0.0588851 0.06322312 0.04744148 0.04814744]
|
|
|
|
mean value: 0.0480158805847168
|
|
|
|
key: score_time
|
|
value: [0.01854205 0.01946282 0.01998758 0.02527785 0.03004384 0.02557468
|
|
0.03072119 0.02437186 0.02554321 0.02506495]
|
|
|
|
mean value: 0.024459004402160645
|
|
|
|
key: test_mcc
|
|
value: [0.74242424 0.76764947 0.91287093 0.73029674 0.81818182 0.45454545
|
|
0.83205029 0.54772256 0.73029674 0.75592895]
|
|
|
|
mean value: 0.7291967202447438
|
|
|
|
key: train_mcc
|
|
value: [0.95979798 0.96056672 0.96076892 0.9900495 0.96 0.96
|
|
0.98 0.97043679 0.96019206 0.97043679]
|
|
|
|
mean value: 0.9672248773218128
|
|
|
|
key: test_accuracy
|
|
value: [0.86956522 0.86956522 0.95454545 0.86363636 0.90909091 0.72727273
|
|
0.90909091 0.77272727 0.86363636 0.86363636]
|
|
|
|
mean value: 0.8602766798418973
|
|
|
|
key: train_accuracy
|
|
value: [0.9798995 0.9798995 0.98 0.995 0.98 0.98 0.99
|
|
0.985 0.98 0.985 ]
|
|
|
|
mean value: 0.9834798994974874
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.85714286 0.95238095 0.85714286 0.90909091 0.72727273
|
|
0.9 0.7826087 0.86956522 0.84210526]
|
|
|
|
mean value: 0.856687469662298
|
|
|
|
key: train_fscore
|
|
value: [0.98 0.97938144 0.97959184 0.99497487 0.98 0.98
|
|
0.99 0.98477157 0.97979798 0.98477157]
|
|
|
|
mean value: 0.9833289281411624
|
|
|
|
key: test_precision
|
|
value: [0.83333333 1. 1. 0.9 0.90909091 0.72727273
|
|
1. 0.75 0.83333333 1. ]
|
|
|
|
mean value: 0.8953030303030303
|
|
|
|
key: train_precision
|
|
value: [0.98 1. 1. 1. 0.98 0.98
|
|
0.99 1. 0.98979592 1. ]
|
|
|
|
mean value: 0.9919795918367347
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.75 0.90909091 0.81818182 0.90909091 0.72727273
|
|
0.81818182 0.81818182 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8295454545454546
|
|
|
|
key: train_recall
|
|
value: [0.98 0.95959596 0.96 0.99 0.98 0.98
|
|
0.99 0.97 0.97 0.97 ]
|
|
|
|
mean value: 0.9749595959595959
|
|
|
|
key: test_roc_auc
|
|
value: [0.87121212 0.875 0.95454545 0.86363636 0.90909091 0.72727273
|
|
0.90909091 0.77272727 0.86363636 0.86363636]
|
|
|
|
mean value: 0.8609848484848485
|
|
|
|
key: train_roc_auc
|
|
value: [0.97989899 0.97979798 0.98 0.995 0.98 0.98
|
|
0.99 0.985 0.98 0.985 ]
|
|
|
|
mean value: 0.983469696969697
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.75 0.90909091 0.75 0.83333333 0.57142857
|
|
0.81818182 0.64285714 0.76923077 0.72727273]
|
|
|
|
mean value: 0.7540626040626041
|
|
|
|
key: train_jcc
|
|
value: [0.96078431 0.95959596 0.96 0.99 0.96078431 0.96078431
|
|
0.98019802 0.97 0.96039604 0.97 ]
|
|
|
|
mean value: 0.9672542960178371
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17059803 0.11714387 0.06824756 0.07706714 0.07887411 0.07943869
|
|
0.07819271 0.09146166 0.07394814 0.03088427]
|
|
|
|
mean value: 0.08658561706542969
|
|
|
|
key: score_time
|
|
value: [0.02730536 0.01919198 0.0194664 0.02080297 0.02410865 0.02245116
|
|
0.01826978 0.02425981 0.01341414 0.01849532]
|
|
|
|
mean value: 0.02077655792236328
|
|
|
|
key: test_mcc
|
|
value: [0.41096386 0.6992059 0.46225016 0.18898224 0.37796447 0.18257419
|
|
0.63636364 0.18257419 0.36514837 0.18257419]
|
|
|
|
mean value: 0.3688601196946537
|
|
|
|
key: train_mcc
|
|
value: [0.99 0.98999899 0.9900495 0.9900495 0.9900495 1.
|
|
0.9900495 0.9900495 0.9900495 0.9900495 ]
|
|
|
|
mean value: 0.9910345520838135
|
|
|
|
key: test_accuracy
|
|
value: [0.69565217 0.82608696 0.72727273 0.59090909 0.68181818 0.59090909
|
|
0.81818182 0.59090909 0.68181818 0.59090909]
|
|
|
|
mean value: 0.6794466403162055
|
|
|
|
key: train_accuracy
|
|
value: [0.99497487 0.99497487 0.995 0.995 0.995 1.
|
|
0.995 0.995 0.995 0.995 ]
|
|
|
|
mean value: 0.9954949748743719
|
|
|
|
key: test_fscore
|
|
value: [0.72 0.8 0.7 0.52631579 0.63157895 0.60869565
|
|
0.81818182 0.60869565 0.66666667 0.57142857]
|
|
|
|
mean value: 0.6651563097466988
|
|
|
|
key: train_fscore
|
|
value: [0.99497487 0.99492386 0.99497487 0.99497487 0.99497487 1.
|
|
0.99497487 0.99497487 0.99497487 0.99497487]
|
|
|
|
mean value: 0.9954722852842894
|
|
|
|
key: test_precision
|
|
value: [0.64285714 1. 0.77777778 0.625 0.75 0.58333333
|
|
0.81818182 0.58333333 0.7 0.6 ]
|
|
|
|
mean value: 0.7080483405483405
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.66666667 0.63636364 0.45454545 0.54545455 0.63636364
|
|
0.81818182 0.63636364 0.63636364 0.54545455]
|
|
|
|
mean value: 0.6393939393939394
|
|
|
|
key: train_recall
|
|
value: [0.99 0.98989899 0.99 0.99 0.99 1.
|
|
0.99 0.99 0.99 0.99 ]
|
|
|
|
mean value: 0.990989898989899
|
|
|
|
key: test_roc_auc
|
|
value: [0.70075758 0.83333333 0.72727273 0.59090909 0.68181818 0.59090909
|
|
0.81818182 0.59090909 0.68181818 0.59090909]
|
|
|
|
mean value: 0.6806818181818182
|
|
|
|
key: train_roc_auc
|
|
value: [0.995 0.99494949 0.995 0.995 0.995 1.
|
|
0.995 0.995 0.995 0.995 ]
|
|
|
|
mean value: 0.9954949494949495
|
|
|
|
key: test_jcc
|
|
value: [0.5625 0.66666667 0.53846154 0.35714286 0.46153846 0.4375
|
|
0.69230769 0.4375 0.5 0.4 ]
|
|
|
|
mean value: 0.5053617216117217
|
|
|
|
key: train_jcc
|
|
value: [0.99 0.98989899 0.99 0.99 0.99 1.
|
|
0.99 0.99 0.99 0.99 ]
|
|
|
|
mean value: 0.990989898989899
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.38503289 0.49508715 0.3759234 0.38898635 0.38849163 0.45909381
|
|
0.41848111 0.39247417 0.40462685 0.38324094]
|
|
|
|
mean value: 0.40914382934570315
|
|
|
|
key: score_time
|
|
value: [0.01456547 0.01054454 0.00914955 0.01094794 0.0096848 0.01354051
|
|
0.01003337 0.01007342 0.00932431 0.01131916]
|
|
|
|
mean value: 0.010918307304382324
|
|
|
|
key: test_mcc
|
|
value: [0.74242424 0.76764947 0.83205029 0.73029674 0.73029674 0.46225016
|
|
0.83205029 0.73029674 0.73029674 0.75592895]
|
|
|
|
mean value: 0.7313540387579033
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.86956522 0.86956522 0.90909091 0.86363636 0.86363636 0.72727273
|
|
0.90909091 0.86363636 0.86363636 0.86363636]
|
|
|
|
mean value: 0.8602766798418973
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.85714286 0.9 0.85714286 0.85714286 0.75
|
|
0.9 0.86956522 0.86956522 0.84210526]
|
|
|
|
mean value: 0.857222948676038
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.83333333 1. 1. 0.9 0.9 0.69230769
|
|
1. 0.83333333 0.83333333 1. ]
|
|
|
|
mean value: 0.8992307692307693
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.75 0.81818182 0.81818182 0.81818182 0.81818182
|
|
0.81818182 0.90909091 0.90909091 0.72727273]
|
|
|
|
mean value: 0.8295454545454546
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.87121212 0.875 0.90909091 0.86363636 0.86363636 0.72727273
|
|
0.90909091 0.86363636 0.86363636 0.86363636]
|
|
|
|
mean value: 0.8609848484848485
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.75 0.81818182 0.75 0.75 0.6
|
|
0.81818182 0.76923077 0.76923077 0.72727273]
|
|
|
|
mean value: 0.7521328671328672
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.66
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03351307 0.04224443 0.04674196 0.04314137 0.04148889 0.04995418
|
|
0.06079268 0.07106638 0.06821966 0.14230299]
|
|
|
|
mean value: 0.05994656085968018
|
|
|
|
key: score_time
|
|
value: [0.0161593 0.01277709 0.01891351 0.01652718 0.01363158 0.01789951
|
|
0.01280761 0.01269722 0.02262688 0.01470828]
|
|
|
|
mean value: 0.015874814987182618
|
|
|
|
key: test_mcc
|
|
value: [ 0.03816905 0.25495628 -0.09090909 0. 0.10846523 0.18257419
|
|
0.2773501 -0.23570226 0.37796447 0.10846523]
|
|
|
|
mean value: 0.10213331963023858
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.52173913 0.60869565 0.45454545 0.5 0.54545455 0.59090909
|
|
0.63636364 0.40909091 0.68181818 0.54545455]
|
|
|
|
mean value: 0.5494071146245059
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.47619048 0.52631579 0.45454545 0.52173913 0.375 0.60869565
|
|
0.6 0.13333333 0.63157895 0.375 ]
|
|
|
|
mean value: 0.4702398783520065
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.71428571 0.45454545 0.5 0.6 0.58333333
|
|
0.66666667 0.25 0.75 0.6 ]
|
|
|
|
mean value: 0.5618831168831169
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.45454545 0.41666667 0.45454545 0.54545455 0.27272727 0.63636364
|
|
0.54545455 0.09090909 0.54545455 0.27272727]
|
|
|
|
mean value: 0.42348484848484846
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.51893939 0.61742424 0.45454545 0.5 0.54545455 0.59090909
|
|
0.63636364 0.40909091 0.68181818 0.54545455]
|
|
|
|
mean value: 0.5499999999999999
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.3125 0.35714286 0.29411765 0.35294118 0.23076923 0.4375
|
|
0.42857143 0.07142857 0.46153846 0.23076923]
|
|
|
|
mean value: 0.3177278603749192
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03369212 0.02714562 0.03655577 0.0427351 0.0424602 0.03337955
|
|
0.02429128 0.01469707 0.01462412 0.01471472]
|
|
|
|
mean value: 0.028429555892944335
|
|
|
|
key: score_time
|
|
value: [0.03666997 0.02219319 0.03791785 0.0199151 0.03027272 0.02427435
|
|
0.0124104 0.01242089 0.01250958 0.01247907]
|
|
|
|
mean value: 0.022106313705444337
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.91666667 0.54772256 0.2773501 0.73029674 0.46225016
|
|
0.73029674 0.73029674 0.64715023 0.63636364]
|
|
|
|
mean value: 0.6155666308391934
|
|
|
|
key: train_mcc
|
|
value: [0.91071836 0.88983239 0.92166048 0.94018806 0.89040077 0.94018806
|
|
0.90162439 0.90072087 0.90072087 0.91040978]
|
|
|
|
mean value: 0.9106464009703785
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.95652174 0.77272727 0.63636364 0.86363636 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8059288537549407
|
|
|
|
key: train_accuracy
|
|
value: [0.95477387 0.94472362 0.96 0.97 0.945 0.97
|
|
0.95 0.95 0.95 0.955 ]
|
|
|
|
mean value: 0.9549497487437185
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.95652174 0.7826087 0.6 0.86956522 0.7
|
|
0.85714286 0.86956522 0.83333333 0.81818182]
|
|
|
|
mean value: 0.8014191605495953
|
|
|
|
key: train_fscore
|
|
value: [0.95384615 0.94358974 0.95876289 0.96969697 0.94416244 0.96969697
|
|
0.94845361 0.94897959 0.94897959 0.95431472]
|
|
|
|
mean value: 0.9540482672709073
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 0.75 0.66666667 0.83333333 0.77777778
|
|
0.9 0.83333333 0.76923077 0.81818182]
|
|
|
|
mean value: 0.8075796425796427
|
|
|
|
key: train_precision
|
|
value: [0.97894737 0.95833333 0.9893617 0.97959184 0.95876289 0.97959184
|
|
0.9787234 0.96875 0.96875 0.96907216]
|
|
|
|
mean value: 0.9729884533153145
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 0.81818182 0.54545455 0.90909091 0.63636364
|
|
0.81818182 0.90909091 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8007575757575758
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:176: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:179: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: train_recall
|
|
value: [0.93 0.92929293 0.93 0.96 0.93 0.96
|
|
0.92 0.93 0.93 0.94 ]
|
|
|
|
mean value: 0.9359292929292929
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.95833333 0.77272727 0.63636364 0.86363636 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.81818182]
|
|
|
|
mean value: 0.806060606060606
|
|
|
|
key: train_roc_auc
|
|
value: [0.95489899 0.94464646 0.96 0.97 0.945 0.97
|
|
0.95 0.95 0.95 0.955 ]
|
|
|
|
mean value: 0.9549545454545455
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.91666667 0.64285714 0.42857143 0.76923077 0.53846154
|
|
0.75 0.76923077 0.71428571 0.69230769]
|
|
|
|
mean value: 0.6793040293040293
|
|
|
|
key: train_jcc
|
|
value: [0.91176471 0.89320388 0.92079208 0.94117647 0.89423077 0.94117647
|
|
0.90196078 0.90291262 0.90291262 0.91262136]
|
|
|
|
mean value: 0.9122751765248133
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.46009588 0.26499319 0.54566193 0.41685939 0.44179821 0.32088614
|
|
0.25617456 0.66312885 0.47220087 0.45935988]
|
|
|
|
mean value: 0.4301158905029297
|
|
|
|
key: score_time
|
|
value: [0.02258277 0.02649212 0.03821397 0.03347635 0.02990937 0.011235
|
|
0.02311802 0.02083945 0.03014112 0.02245569]
|
|
|
|
mean value: 0.025846385955810548
|
|
|
|
key: test_mcc
|
|
value: [0.47727273 0.91666667 0.54772256 0.2773501 0.73029674 0.46225016
|
|
0.73029674 0.73029674 0.64715023 0.63636364]
|
|
|
|
mean value: 0.6155666308391934
|
|
|
|
key: train_mcc
|
|
value: [0.76922303 0.88983239 0.92166048 0.96019206 0.89040077 0.94018806
|
|
0.90162439 0.90072087 0.90072087 0.91040978]
|
|
|
|
mean value: 0.8984972680700725
|
|
|
|
key: test_accuracy
|
|
value: [0.73913043 0.95652174 0.77272727 0.63636364 0.86363636 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8059288537549407
|
|
|
|
key: train_accuracy
|
|
value: [0.88442211 0.94472362 0.96 0.98 0.945 0.97
|
|
0.95 0.95 0.95 0.955 ]
|
|
|
|
mean value: 0.9489145728643216
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.95652174 0.7826087 0.6 0.86956522 0.7
|
|
0.85714286 0.86956522 0.83333333 0.81818182]
|
|
|
|
mean value: 0.8014191605495953
|
|
|
|
key: train_fscore
|
|
value: [0.88324873 0.94358974 0.95876289 0.97979798 0.94416244 0.96969697
|
|
0.94845361 0.94897959 0.94897959 0.95431472]
|
|
|
|
mean value: 0.9479986259928396
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 0.75 0.66666667 0.83333333 0.77777778
|
|
0.9 0.83333333 0.76923077 0.81818182]
|
|
|
|
mean value: 0.8075796425796427
|
|
|
|
key: train_precision
|
|
value: [0.89690722 0.95833333 0.9893617 0.98979592 0.95876289 0.97959184
|
|
0.9787234 0.96875 0.96875 0.96907216]
|
|
|
|
mean value: 0.965804846285959
|
|
|
|
key: test_recall
|
|
value: [0.72727273 0.91666667 0.81818182 0.54545455 0.90909091 0.63636364
|
|
0.81818182 0.90909091 0.90909091 0.81818182]
|
|
|
|
mean value: 0.8007575757575758
|
|
|
|
key: train_recall
|
|
value: [0.87 0.92929293 0.93 0.97 0.93 0.96
|
|
0.92 0.93 0.93 0.94 ]
|
|
|
|
mean value: 0.9309292929292929
|
|
|
|
key: test_roc_auc
|
|
value: [0.73863636 0.95833333 0.77272727 0.63636364 0.86363636 0.72727273
|
|
0.86363636 0.86363636 0.81818182 0.81818182]
|
|
|
|
mean value: 0.806060606060606
|
|
|
|
key: train_roc_auc
|
|
value: [0.88449495 0.94464646 0.96 0.98 0.945 0.97
|
|
0.95 0.95 0.95 0.955 ]
|
|
|
|
mean value: 0.9489141414141414
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.91666667 0.64285714 0.42857143 0.76923077 0.53846154
|
|
0.75 0.76923077 0.71428571 0.69230769]
|
|
|
|
mean value: 0.6793040293040293
|
|
|
|
key: train_jcc
|
|
value: [0.79090909 0.89320388 0.92079208 0.96039604 0.89423077 0.94117647
|
|
0.90196078 0.90291262 0.90291262 0.91262136]
|
|
|
|
mean value: 0.9021115719290596
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
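Note: the pipeline above scales the 168 numerical columns with MinMaxScaler and one-hot encodes the 7 categorical columns before the model sees them. A minimal pure-Python sketch of what those two transforms compute for a single column. The sample values and the category labels are hypothetical and are not taken from this dataset.

```python
def min_max_scale(values):
    # Map a numerical column onto [0, 1]: (v - min) / (max - min).
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has zero range; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    # One indicator column per distinct category, categories in sorted order.
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical values for one numerical and one categorical feature.
ligand_distance = [2.0, 6.0, 10.0]
ss_class = ["helix", "loop", "helix"]

print(min_max_scale(ligand_distance))
categories, encoded = one_hot(ss_class)
print(categories, encoded)
```

In the logged pipeline the scaler and encoder are fitted on the training folds only and then applied to the held-out fold. This sketch ignores that fit/transform split.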
|
|
|
|
key: fit_time
|
|
value: [0.11089849 0.12921739 0.10791731 0.14299321 0.28428626 0.17262435
|
|
0.15614367 0.13838696 0.15337706 0.13365555]
|
|
|
|
mean value: 0.15295002460479737
|
|
|
|
key: score_time
|
|
value: [0.01749301 0.02020574 0.019068 0.03575182 0.02044344 0.01739883
|
|
0.02974224 0.02896857 0.0146215 0.01770163]
|
|
|
|
mean value: 0.022139477729797363
|
|
|
|
key: test_mcc
|
|
value: [0.812277 0.78107015 0.84016287 0.82614456 0.73529412 0.78632938
|
|
0.76603235 0.80961181 0.82352941 0.82352941]
|
|
|
|
mean value: 0.8003981065072887
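Note: each "mean value" line is the average of the ten per-fold scores printed just above it. A short sketch that reproduces the test_mcc mean from the rounded fold values in this log and adds the fold-to-fold standard deviation, which the log does not print. The variable names are illustrative.

```python
from statistics import mean, stdev

# Per-fold test_mcc values copied from the log (rounded to 8 decimals).
test_mcc = [
    0.812277, 0.78107015, 0.84016287, 0.82614456, 0.73529412,
    0.78632938, 0.76603235, 0.80961181, 0.82352941, 0.82352941,
]

fold_mean = mean(test_mcc)
fold_sd = stdev(test_mcc)  # sample standard deviation across folds

print(f"mean test_mcc = {fold_mean:.4f}")
print(f"sd   test_mcc = {fold_sd:.4f}")
```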
|
|
|
|
key: train_mcc
|
|
value: [0.82559993 0.83701936 0.83389761 0.84190012 0.84039088 0.82747132
|
|
0.83230165 0.82736156 0.84202959 0.83063652]
|
|
|
|
mean value: 0.8338608549501372
|
|
|
|
key: test_accuracy
|
|
value: [0.90510949 0.89051095 0.91970803 0.91240876 0.86764706 0.88970588
|
|
0.88235294 0.90441176 0.91176471 0.91176471]
|
|
|
|
mean value: 0.8995384285100901
|
|
|
|
key: train_accuracy
|
|
value: [0.91279544 0.91850041 0.91687042 0.9209454 0.92019544 0.91368078
|
|
0.91612378 0.91368078 0.92100977 0.91530945]
|
|
|
|
mean value: 0.9169111654441727
|
|
|
|
key: test_fscore
|
|
value: [0.90076336 0.88888889 0.92198582 0.91549296 0.86764706 0.89655172
|
|
0.88571429 0.90225564 0.91176471 0.91176471]
|
|
|
|
mean value: 0.9002829140555026
|
|
|
|
key: train_fscore
|
|
value: [0.9130788 0.91830065 0.91598023 0.92068684 0.92019544 0.91297209
|
|
0.91564292 0.91368078 0.92081633 0.91558442]
|
|
|
|
mean value: 0.9166938482254936
|
|
|
|
key: test_precision
|
|
value: [0.93650794 0.89552239 0.90277778 0.89041096 0.86764706 0.84415584
|
|
0.86111111 0.92307692 0.91176471 0.91176471]
|
|
|
|
mean value: 0.8944739410181639
|
|
|
|
key: train_precision
|
|
value: [0.910859 0.92131148 0.92512479 0.92295082 0.92019544 0.9205298
|
|
0.92092257 0.91368078 0.92307692 0.91262136]
|
|
|
|
mean value: 0.9191272957372615
|
|
|
|
key: test_recall
|
|
value: [0.86764706 0.88235294 0.94202899 0.94202899 0.86764706 0.95588235
|
|
0.91176471 0.88235294 0.91176471 0.91176471]
|
|
|
|
mean value: 0.9075234441602728
|
|
|
|
key: train_recall
|
|
value: [0.91530945 0.91530945 0.90701468 0.91843393 0.92019544 0.90553746
|
|
0.91042345 0.91368078 0.91856678 0.91856678]
|
|
|
|
mean value: 0.9143038189924066
|
|
|
|
key: test_roc_auc
|
|
value: [0.90483802 0.89045183 0.9195439 0.91219096 0.86764706 0.88970588
|
|
0.88235294 0.90441176 0.91176471 0.91176471]
|
|
|
|
mean value: 0.899467178175618
|
|
|
|
key: train_roc_auc
|
|
value: [0.91279339 0.91850301 0.91686239 0.92094335 0.92019544 0.91368078
|
|
0.91612378 0.91368078 0.92100977 0.91530945]
|
|
|
|
mean value: 0.9169102135596282
|
|
|
|
key: test_jcc
|
|
value: [0.81944444 0.8 0.85526316 0.84415584 0.76623377 0.8125
|
|
0.79487179 0.82191781 0.83783784 0.83783784]
|
|
|
|
mean value: 0.819006249149544
|
|
|
|
key: train_jcc
|
|
value: [0.84005979 0.8489426 0.8449848 0.8530303 0.85218703 0.83987915
|
|
0.84441088 0.84107946 0.85325265 0.84431138]
|
|
|
|
mean value: 0.8462138038269915
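Note: assuming "jcc" is the Jaccard index and "fscore" is F1 for the same positive class, the two are linked by J = F / (2 - F), because J = TP / (TP + FP + FN) and F = 2TP / (2TP + FP + FN). The sketch below checks this identity against the per-fold test values printed above. The identity is standard; that the script uses these exact definitions is an assumption.

```python
# Per-fold test_fscore and test_jcc values copied from the log.
test_fscore = [
    0.90076336, 0.88888889, 0.92198582, 0.91549296, 0.86764706,
    0.89655172, 0.88571429, 0.90225564, 0.91176471, 0.91176471,
]
test_jcc = [
    0.81944444, 0.8, 0.85526316, 0.84415584, 0.76623377,
    0.8125, 0.79487179, 0.82191781, 0.83783784, 0.83783784,
]


def jaccard_from_f1(f1):
    # J = F / (2 - F), valid for a single binary positive class.
    return f1 / (2.0 - f1)


max_gap = max(abs(jaccard_from_f1(f) - j) for f, j in zip(test_fscore, test_jcc))
print(f"largest |J(F) - logged jcc| = {max_gap:.2e}")
```

Since the Jaccard score is a monotone function of F1 here, it ranks the folds the same way and adds no independent information to the comparison.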
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
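Note: on the blind test set the MCC (0.61) is well below the cross-validated test_mcc mean (about 0.80), while accuracy stays at 0.89. One possible explanation, which the log does not confirm, is class imbalance in the blind set: accuracy is dominated by the majority class, whereas MCC uses all four confusion-matrix cells. A minimal sketch with hypothetical counts, not taken from this run.

```python
import math


def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient from confusion-matrix counts.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a marginal is empty (zero denominator).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical imbalanced test set: 15 positives, 85 negatives.
tp, fn, fp, tn = 10, 5, 6, 79

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

With these counts most negatives are classified correctly, so accuracy is high, but a third of the positives are missed and MCC is noticeably lower.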
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
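
Note on the output format that follows: the keys below (`fit_time`, `score_time`, `test_mcc`, `train_mcc`, ...) have the shape that scikit-learn's `cross_validate` returns when it is given a dictionary of scorers and `return_train_score=True`, with ten values per key for ten folds. The sketch below reproduces that shape under assumptions. The toy column names (`num_a`, `num_b`, `cat_a`), the synthetic data, the choice of `LogisticRegression`, and the reduced scorer set are hypothetical and are not taken from the script that produced this log.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data: two numerical columns and one categorical column.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "cat_a": rng.choice(["x", "y", "z"], size=n),
})
y = (X["num_a"] + 0.5 * rng.normal(size=n) > 0).astype(int)

num_cols = ["num_a", "num_b"]
cat_cols = ["cat_a"]

# Same layout as the logged pipeline: scale numerical columns, one-hot encode
# categorical columns, pass anything else through, then fit the model.
pipe = Pipeline(steps=[
    ("prep", ColumnTransformer(
        remainder="passthrough",
        transformers=[
            ("num", MinMaxScaler(), num_cols),
            ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
        ],
    )),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# Each scorer name becomes a "test_<name>" and "train_<name>" key in the result.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "jcc": make_scorer(jaccard_score),
}

scores = cross_validate(pipe, X, y, cv=10, scoring=scoring,
                        return_train_score=True)

# Print in the same key / value / mean value layout used in this log.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refitted on the training part of every fold, so no information from the held-out fold leaks into the transformation.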
|
|
|
|
key: fit_time
|
|
value: [3.5434382 3.34984708 3.27759814 2.81686544 4.5030005 3.56405783
|
|
2.97054791 2.67048979 3.13669658 3.85361886]
|
|
|
|
mean value: 3.3686160326004027
|
|
|
|
key: score_time
|
|
value: [0.0126543 0.02569604 0.02097726 0.02703667 0.02031159 0.01380563
|
|
0.01514602 0.02036047 0.01601267 0.01590753]
|
|
|
|
mean value: 0.018790817260742186
|
|
|
|
key: test_mcc
|
|
value: [0.91277477 0.89869927 0.92709446 0.88654289 0.79967098 0.8623165
|
|
0.83905224 0.8722811 0.82675403 0.94280904]
|
|
|
|
mean value: 0.876799528398134
|
|
|
|
key: train_mcc
|
|
value: [0.89121524 0.9502579 0.89924844 0.95171372 0.95938646 0.95199397
|
|
0.90584507 0.95827005 0.92365982 0.9534379 ]
|
|
|
|
mean value: 0.9345028559218945
|
|
|
|
key: test_accuracy
|
|
value: [0.95620438 0.94890511 0.96350365 0.94160584 0.89705882 0.92647059
|
|
0.91911765 0.93382353 0.91176471 0.97058824]
|
|
|
|
mean value: 0.9369042507513955
|
|
|
|
key: train_accuracy
|
|
value: [0.94539527 0.97473513 0.94947025 0.97555012 0.97964169 0.97557003
|
|
0.95276873 0.97882736 0.96172638 0.97638436]
|
|
|
|
mean value: 0.9670069341021373
|
|
|
|
key: test_fscore
|
|
value: [0.95522388 0.94964029 0.96402878 0.94444444 0.90277778 0.93150685
|
|
0.92086331 0.93706294 0.91549296 0.97142857]
|
|
|
|
mean value: 0.9392469792473013
|
|
|
|
key: train_fscore
|
|
value: [0.94627105 0.97525938 0.95008052 0.97596154 0.97978981 0.97607656
|
|
0.95337621 0.9792 0.96212732 0.97681855]
|
|
|
|
mean value: 0.967496091849667
|
|
|
|
key: test_precision
|
|
value: [0.96969697 0.92957746 0.95714286 0.90666667 0.85526316 0.87179487
|
|
0.90140845 0.89333333 0.87837838 0.94444444]
|
|
|
|
mean value: 0.9107706594845216
|
|
|
|
key: train_precision
|
|
value: [0.93206951 0.95618153 0.93799682 0.95905512 0.97271268 0.95625
|
|
0.94126984 0.96226415 0.95215311 0.95918367]
|
|
|
|
mean value: 0.9529136438683203
|
|
|
|
key: test_recall
|
|
value: [0.94117647 0.97058824 0.97101449 0.98550725 0.95588235 1.
|
|
0.94117647 0.98529412 0.95588235 1. ]
|
|
|
|
mean value: 0.9706521739130435
|
|
|
|
key: train_recall
|
|
value: [0.96091205 0.99511401 0.96247961 0.99347471 0.98697068 0.99674267
|
|
0.96579805 0.99674267 0.9723127 0.99511401]
|
|
|
|
mean value: 0.9825661163392511
|
|
|
|
key: test_roc_auc
|
|
value: [0.95609548 0.94906223 0.96344842 0.94128303 0.89705882 0.92647059
|
|
0.91911765 0.93382353 0.91176471 0.97058824]
|
|
|
|
mean value: 0.9368712702472294
|
|
|
|
key: train_roc_auc
|
|
value: [0.94538262 0.9747185 0.94948085 0.97556472 0.97964169 0.97557003
|
|
0.95276873 0.97882736 0.96172638 0.97638436]
|
|
|
|
mean value: 0.9670065252854813
|
|
|
|
key: test_jcc
|
|
value: [0.91428571 0.90410959 0.93055556 0.89473684 0.82278481 0.87179487
|
|
0.85333333 0.88157895 0.84415584 0.94444444]
|
|
|
|
mean value: 0.8861779952211126
|
|
|
|
key: train_jcc
|
|
value: [0.89802131 0.9517134 0.90490798 0.95305164 0.96038035 0.95327103
|
|
0.9109063 0.95924765 0.92701863 0.9546875 ]
|
|
|
|
mean value: 0.9373205780408035
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
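
Note on reading the blind-test figures: accuracy can stay high while MCC is much lower when the classes are imbalanced, because MCC uses all four cells of the confusion matrix. The sketch below computes accuracy, MCC, and the Jaccard index (`jcc` in this log) from confusion-matrix counts using their standard definitions. The counts are hypothetical, chosen only to illustrate the effect; they are not the blind-test counts of this run, which the log does not report.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def jaccard(tp, fp, fn):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    union = tp + fp + fn
    return tp / union if union else 0.0


# Hypothetical imbalanced test set: 30 positives, 170 negatives.
tp, tn, fp, fn = 20, 160, 10, 10

blind_accuracy = accuracy(tp, tn, fp, fn)
blind_mcc = mcc(tp, tn, fp, fn)
blind_jcc = jaccard(tp, fp, fn)

print(f"Accuracy: {blind_accuracy:.2f}")
print(f"MCC: {blind_mcc:.2f}")
print(f"Jaccard: {blind_jcc:.2f}")
```

With these illustrative counts the accuracy is 0.90 while the MCC is about 0.61, because a third of the minority class is misclassified even though most predictions overall are correct.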
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02296066 0.01715469 0.01691723 0.01694107 0.01650357 0.01609659
|
|
0.01617169 0.01619339 0.01607752 0.01604986]
|
|
|
|
mean value: 0.01710662841796875
|
|
|
|
key: score_time
|
|
value: [0.02127862 0.01170492 0.01152253 0.01181746 0.01124525 0.01138163
|
|
0.01126766 0.0113914 0.0113976 0.01129961]
|
|
|
|
mean value: 0.012430667877197266
|
|
|
|
key: test_mcc
|
|
value: [0.54864511 0.66746486 0.63690876 0.66496068 0.65417114 0.63406934
|
|
0.58925565 0.57408838 0.65737574 0.71081865]
|
|
|
|
mean value: 0.6337758300707486
|
|
|
|
key: train_mcc
|
|
value: [0.66514107 0.64307836 0.65869844 0.6519476 0.62878522 0.64676878
|
|
0.65690484 0.63901601 0.66318405 0.6395367 ]
|
|
|
|
mean value: 0.649306107344115
|
|
|
|
key: test_accuracy
|
|
value: [0.77372263 0.83211679 0.81751825 0.83211679 0.82352941 0.81617647
|
|
0.79411765 0.78676471 0.82352941 0.85294118]
|
|
|
|
mean value: 0.8152533276084156
|
|
|
|
key: train_accuracy
|
|
value: [0.83211084 0.8207009 0.82885086 0.82559087 0.80863192 0.82247557
|
|
0.82736156 0.81840391 0.83143322 0.81921824]
|
|
|
|
mean value: 0.8234777893700108
|
|
|
|
key: test_fscore
|
|
value: [0.76335878 0.82170543 0.81203008 0.82962963 0.80952381 0.82269504
|
|
0.78787879 0.78195489 0.80645161 0.84375 ]
|
|
|
|
mean value: 0.8078978042785004
|
|
|
|
key: train_fscore
|
|
value: [0.8277592 0.81418919 0.8238255 0.82107023 0.78847885 0.81556684
|
|
0.82003396 0.81053526 0.82878412 0.81375839]
|
|
|
|
mean value: 0.8164001531098434
|
|
|
|
key: test_precision
|
|
value: [0.79365079 0.86885246 0.84375 0.84848485 0.87931034 0.79452055
|
|
0.8125 0.8 0.89285714 0.9 ]
|
|
|
|
mean value: 0.8433926136781971
|
|
|
|
key: train_precision
|
|
value: [0.85051546 0.84561404 0.84801382 0.84219554 0.88128773 0.84859155
|
|
0.85638298 0.84724689 0.84201681 0.83910035]
|
|
|
|
mean value: 0.850096515501237
|
|
|
|
key: test_recall
|
|
value: [0.73529412 0.77941176 0.7826087 0.8115942 0.75 0.85294118
|
|
0.76470588 0.76470588 0.73529412 0.79411765]
|
|
|
|
mean value: 0.7770673486786018
|
|
|
|
key: train_recall
|
|
value: [0.80618893 0.78501629 0.80097879 0.80097879 0.71335505 0.78501629
|
|
0.78664495 0.77687296 0.81596091 0.78990228]
|
|
|
|
mean value: 0.7860915240367499
|
|
|
|
key: test_roc_auc
|
|
value: [0.77344416 0.83173487 0.81777494 0.83226769 0.82352941 0.81617647
|
|
0.79411765 0.78676471 0.82352941 0.85294118]
|
|
|
|
mean value: 0.8152280477408355
|
|
|
|
key: train_roc_auc
|
|
value: [0.83213198 0.82073 0.82882816 0.82557083 0.80863192 0.82247557
|
|
0.82736156 0.81840391 0.83143322 0.81921824]
|
|
|
|
mean value: 0.8234785404190423
|
|
|
|
key: test_jcc
|
|
value: [0.61728395 0.69736842 0.6835443 0.70886076 0.68 0.69879518
|
|
0.65 0.64197531 0.67567568 0.72972973]
|
|
|
|
mean value: 0.6783233329731327
|
|
|
|
key: train_jcc
|
|
value: [0.70613409 0.68660969 0.70042796 0.6964539 0.65081724 0.68857143
|
|
0.69496403 0.68142857 0.70762712 0.68599717]
|
|
|
|
mean value: 0.6899031196349484
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02590728 0.01991701 0.01922178 0.01944113 0.01963425 0.01889205
|
|
0.01906657 0.01980186 0.019485 0.01988626]
|
|
|
|
mean value: 0.020125317573547363
|
|
|
|
key: score_time
|
|
value: [0.0207057 0.01319242 0.01319385 0.01324558 0.01323295 0.01321483
|
|
0.01333451 0.01359987 0.01312637 0.01353455]
|
|
|
|
mean value: 0.01403806209564209
|
|
|
|
key: test_mcc
|
|
value: [0.59494906 0.66746486 0.59205603 0.53314859 0.64733887 0.58430655
|
|
0.4853466 0.63296924 0.51476155 0.66183628]
|
|
|
|
mean value: 0.5914177619460329
|
|
|
|
key: train_mcc
|
|
value: [0.6332577 0.60556753 0.63000061 0.59274722 0.60261865 0.60755713
|
|
0.63849068 0.61889579 0.61728431 0.59446333]
|
|
|
|
mean value: 0.6140882947531848
|
|
|
|
key: test_accuracy
|
|
value: [0.79562044 0.83211679 0.79562044 0.76642336 0.82352941 0.78676471
|
|
0.74264706 0.81617647 0.75735294 0.83088235]
|
|
|
|
mean value: 0.7947133963074281
|
|
|
|
key: train_accuracy
|
|
value: [0.81662592 0.80277099 0.81499593 0.79625102 0.80130293 0.80374593
|
|
0.81921824 0.80944625 0.80863192 0.79723127]
|
|
|
|
mean value: 0.8070220394012037
|
|
|
|
key: test_fscore
|
|
value: [0.78125 0.82170543 0.8028169 0.76470588 0.82608696 0.80536913
|
|
0.74452555 0.82014388 0.75555556 0.83211679]
|
|
|
|
mean value: 0.7954276070370564
|
|
|
|
key: train_fscore
|
|
value: [0.81722177 0.80388979 0.81529699 0.79304636 0.80194805 0.80229696
|
|
0.81803279 0.8091354 0.80940795 0.79706601]
|
|
|
|
mean value: 0.8067342073255446
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.86885246 0.78082192 0.7761194 0.81428571 0.74074074
|
|
0.73913043 0.8028169 0.76119403 0.82608696]
|
|
|
|
mean value: 0.794338189073302
|
|
|
|
key: train_precision
|
|
value: [0.81523501 0.8 0.81331169 0.80504202 0.79935275 0.80826446
|
|
0.82343234 0.81045752 0.80613893 0.79771615]
|
|
|
|
mean value: 0.8078950870261012
|
|
|
|
key: test_recall
|
|
value: [0.73529412 0.77941176 0.82608696 0.75362319 0.83823529 0.88235294
|
|
0.75 0.83823529 0.75 0.83823529]
|
|
|
|
mean value: 0.799147485080989
|
|
|
|
key: train_recall
|
|
value: [0.81921824 0.80781759 0.81729201 0.78140294 0.80456026 0.79641694
|
|
0.81270358 0.80781759 0.81270358 0.79641694]
|
|
|
|
mean value: 0.8056349666030788
|
|
|
|
key: test_roc_auc
|
|
value: [0.79518329 0.83173487 0.79539642 0.76651748 0.82352941 0.78676471
|
|
0.74264706 0.81617647 0.75735294 0.83088235]
|
|
|
|
mean value: 0.7946184995737425
|
|
|
|
key: train_roc_auc
|
|
value: [0.8166238 0.80276687 0.81499779 0.79623893 0.80130293 0.80374593
|
|
0.81921824 0.80944625 0.80863192 0.79723127]
|
|
|
|
mean value: 0.8070203941740042
|
|
|
|
key: test_jcc
|
|
value: [0.64102564 0.69736842 0.67058824 0.61904762 0.7037037 0.6741573
|
|
0.59302326 0.69512195 0.60714286 0.7125 ]
|
|
|
|
mean value: 0.6613678987670822
|
|
|
|
key: train_jcc
|
|
value: [0.69093407 0.67208672 0.68818681 0.65706447 0.66937669 0.66986301
|
|
0.69209431 0.67945205 0.67983651 0.66260163]
|
|
|
|
mean value: 0.676149628585884
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.0179677 0.01519823 0.01462317 0.01480222 0.01445556 0.01456404
|
|
0.01472354 0.01493502 0.01904297 0.01459312]
|
|
|
|
mean value: 0.015490555763244629
|
|
|
|
key: score_time
|
|
value: [0.04223895 0.02163744 0.03045964 0.02881622 0.02916098 0.03041983
|
|
0.03320718 0.03256726 0.03657246 0.03049612]
|
|
|
|
mean value: 0.03155760765075684
|
|
|
|
key: test_mcc
|
|
value: [0.73634276 0.74250909 0.79104463 0.7744776 0.71364124 0.81150267
|
|
0.75139136 0.8131434 0.83258145 0.79405762]
|
|
|
|
mean value: 0.7760691817616058
|
|
|
|
key: train_mcc
|
|
value: [0.86063511 0.85844508 0.85883488 0.85524077 0.86229972 0.85903147
|
|
0.86695884 0.85204744 0.86258777 0.8582323 ]
|
|
|
|
mean value: 0.8594313394535598
|
|
|
|
key: test_accuracy
|
|
value: [0.86131387 0.86861314 0.89051095 0.88321168 0.85294118 0.89705882
|
|
0.86764706 0.90441176 0.91176471 0.88970588]
|
|
|
|
mean value: 0.8827179046801202
|
|
|
|
key: train_accuracy
|
|
value: [0.92665037 0.92583537 0.92583537 0.92339038 0.92833876 0.92508143
|
|
0.92996743 0.9218241 0.92752443 0.92508143]
|
|
|
|
mean value: 0.925952908101909
|
|
|
|
key: test_fscore
|
|
value: [0.87248322 0.875 0.89932886 0.89189189 0.8630137 0.90666667
|
|
0.88 0.90909091 0.91780822 0.89932886]
|
|
|
|
mean value: 0.8914612325055002
|
|
|
|
key: train_fscore
|
|
value: [0.93119266 0.9302682 0.9302682 0.92835366 0.93220339 0.93009119
|
|
0.93415008 0.92694064 0.93200917 0.92987805]
|
|
|
|
mean value: 0.9305355224718177
|
|
|
|
key: test_precision
|
|
value: [0.80246914 0.82894737 0.8375 0.83544304 0.80769231 0.82926829
|
|
0.80487805 0.86666667 0.85897436 0.82716049]
|
|
|
|
mean value: 0.8298999710822114
|
|
|
|
key: train_precision
|
|
value: [0.87752161 0.87843705 0.87716763 0.87124464 0.88450292 0.87179487
|
|
0.88150289 0.87 0.87769784 0.8739255 ]
|
|
|
|
mean value: 0.8763794955944838
|
|
|
|
key: test_recall
|
|
value: [0.95588235 0.92647059 0.97101449 0.95652174 0.92647059 1.
|
|
0.97058824 0.95588235 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9633418584825235
|
|
|
|
key: train_recall
|
|
value: [0.99185668 0.98859935 0.99021207 0.99347471 0.98534202 0.99674267
|
|
0.99348534 0.99185668 0.99348534 0.99348534]
|
|
|
|
mean value: 0.991854020649234
|
|
|
|
key: test_roc_auc
|
|
value: [0.86199915 0.8690324 0.88991901 0.88267263 0.85294118 0.89705882
|
|
0.86764706 0.90441176 0.91176471 0.88970588]
|
|
|
|
mean value: 0.8827152600170503
|
|
|
|
key: train_roc_auc
|
|
value: [0.92659718 0.92578418 0.92588779 0.92344745 0.92833876 0.92508143
|
|
0.92996743 0.9218241 0.92752443 0.92508143]
|
|
|
|
mean value: 0.9259534196640646
|
|
|
|
key: test_jcc
|
|
value: [0.77380952 0.77777778 0.81707317 0.80487805 0.75903614 0.82926829
|
|
0.78571429 0.83333333 0.84810127 0.81707317]
|
|
|
|
mean value: 0.8046065013962848
|
|
|
|
key: train_jcc
|
|
value: [0.87124464 0.86962751 0.86962751 0.86628734 0.87301587 0.86931818
|
|
0.87643678 0.86382979 0.87267525 0.86894587]
|
|
|
|
mean value: 0.8701008732472146
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09757638 0.09589529 0.09379077 0.09431267 0.08427525 0.09253359
|
|
0.09280396 0.0933609 0.09143758 0.09396386]
|
|
|
|
mean value: 0.092995023727417
|
|
|
|
key: score_time
|
|
value: [0.03198242 0.03211188 0.03194284 0.03166461 0.03045869 0.03099656
|
|
0.0306046 0.03076148 0.03069448 0.03143573]
|
|
|
|
mean value: 0.03126533031463623
|
|
|
|
key: test_mcc
|
|
value: [0.76762243 0.79590547 0.75369214 0.78496269 0.76503685 0.78357455
|
|
0.72254413 0.79446135 0.75073095 0.86774089]
|
|
|
|
mean value: 0.7786271446200027
|
|
|
|
key: train_mcc
|
|
value: [0.84380437 0.83865748 0.84037446 0.83212135 0.85204583 0.83555919
|
|
0.82737912 0.83582531 0.84856234 0.82578657]
|
|
|
|
mean value: 0.8380116016387307
|
|
|
|
key: test_accuracy
|
|
value: [0.88321168 0.89781022 0.87591241 0.89051095 0.88235294 0.88970588
|
|
0.86029412 0.89705882 0.875 0.93382353]
|
|
|
|
mean value: 0.88856805495921
|
|
|
|
key: train_accuracy
|
|
value: [0.92176039 0.9193154 0.9201304 0.91605542 0.92589577 0.91775244
|
|
0.91368078 0.91775244 0.9242671 0.91286645]
|
|
|
|
mean value: 0.9189476597405286
|
|
|
|
key: test_fscore
|
|
value: [0.87878788 0.89552239 0.88111888 0.89655172 0.88405797 0.8951049
|
|
0.86524823 0.89552239 0.87769784 0.93430657]
|
|
|
|
mean value: 0.890391876430352
|
|
|
|
key: train_fscore
|
|
value: [0.92282958 0.91970803 0.92071197 0.91619203 0.92679002 0.91821862
|
|
0.91396104 0.9188755 0.92457421 0.91336032]
|
|
|
|
mean value: 0.9195221333056501
|
|
|
|
key: test_precision
|
|
value: [0.90625 0.90909091 0.85135135 0.85526316 0.87142857 0.85333333
|
|
0.83561644 0.90909091 0.85915493 0.92753623]
|
|
|
|
mean value: 0.8778115832007498
|
|
|
|
key: train_precision
|
|
value: [0.91111111 0.91599354 0.91332263 0.91396104 0.91573927 0.91304348
|
|
0.91100324 0.90649762 0.92084006 0.90821256]
|
|
|
|
mean value: 0.9129724551475382
|
|
|
|
key: test_recall
|
|
value: [0.85294118 0.88235294 0.91304348 0.94202899 0.89705882 0.94117647
|
|
0.89705882 0.88235294 0.89705882 0.94117647]
|
|
|
|
mean value: 0.9046248934356351
|
|
|
|
key: train_recall
|
|
value: [0.93485342 0.92345277 0.92822186 0.91843393 0.93811075 0.92345277
|
|
0.91693811 0.93159609 0.92833876 0.91856678]
|
|
|
|
mean value: 0.9261965237444936
|
|
|
|
key: test_roc_auc
|
|
value: [0.88299233 0.89769821 0.87563939 0.89013214 0.88235294 0.88970588
|
|
0.86029412 0.89705882 0.875 0.93382353]
|
|
|
|
mean value: 0.8884697357203751
|
|
|
|
key: train_roc_auc
|
|
value: [0.92174971 0.91931203 0.92013699 0.91605736 0.92589577 0.91775244
|
|
0.91368078 0.91775244 0.9242671 0.91286645]
|
|
|
|
mean value: 0.9189471069285992
|
|
|
|
key: test_jcc
|
|
value: [0.78378378 0.81081081 0.7875 0.8125 0.79220779 0.81012658
|
|
0.7625 0.81081081 0.78205128 0.87671233]
|
|
|
|
mean value: 0.8029003390710084
|
|
|
|
key: train_jcc
|
|
value: [0.85671642 0.85135135 0.85307346 0.84534535 0.86356822 0.8488024
|
|
0.84155456 0.84992571 0.85972851 0.84053651]
|
|
|
|
mean value: 0.8510602473270432
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [ 9.81939363 11.19015908 13.26645947 11.72809553 6.11142302 5.9774859
|
|
11.22838759 8.03849244 6.38793683 2.85526633]
|
|
|
|
mean value: 8.660309982299804
|
|
|
|
key: score_time
|
|
value: [0.01772332 0.02231932 0.02762604 0.01474333 0.01319838 0.02981472
|
|
0.01590967 0.01788568 0.0153389 0.0166378 ]
|
|
|
|
mean value: 0.01911971569061279
|
|
|
|
key: test_mcc
|
|
value: [0.98550725 0.88938138 0.9158731 0.92944673 0.84942274 0.92898531
|
|
0.92898531 0.97100831 0.94280904 0.95681396]
|
|
|
|
mean value: 0.9298233117785435
|
|
|
|
key: train_mcc
|
|
value: [0.9967453 0.93067304 0.99350118 0.99837134 0.99512588 0.98224233
|
|
0.99837266 0.99674796 0.99837266 0.96007955]
|
|
|
|
mean value: 0.9850231903911716
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.94160584 0.95620438 0.96350365 0.91911765 0.96323529
|
|
0.96323529 0.98529412 0.97058824 0.97794118]
|
|
|
|
mean value: 0.9633426363246028
|
|
|
|
key: train_accuracy
|
|
value: [0.99837001 0.96414018 0.99674002 0.999185 0.997557 0.99104235
|
|
0.99918567 0.99837134 0.99918567 0.97964169]
|
|
|
|
mean value: 0.992341892117901
|
|
|
|
key: test_fscore
|
|
value: [0.99270073 0.94444444 0.95833333 0.96503497 0.92517007 0.96453901
|
|
0.96453901 0.98550725 0.97142857 0.97841727]
|
|
|
|
mean value: 0.9650114638943791
|
|
|
|
key: train_fscore
|
|
value: [0.99837398 0.96540881 0.99674797 0.999185 0.99756296 0.99112187
|
|
0.99918633 0.99837398 0.99918633 0.98004789]
|
|
|
|
mean value: 0.9925195119264727
|
|
|
|
key: test_precision
|
|
value: [0.98550725 0.89473684 0.92 0.93243243 0.86075949 0.93150685
|
|
0.93150685 0.97142857 0.94444444 0.95774648]
|
|
|
|
mean value: 0.9330069207961785
|
|
|
|
key: train_precision
|
|
value: [0.99675325 0.9331307 0.99351702 0.99837134 0.99513776 0.9824
|
|
0.99837398 0.99675325 0.99837398 0.96087637]
|
|
|
|
mean value: 0.9853687646105626
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.99275362 0.94202899 0.95588235 0.96323529 0.91911765 0.96323529
|
|
0.96323529 0.98529412 0.97058824 0.97794118]
|
|
|
|
mean value: 0.9633312020460358
|
|
|
|
key: train_roc_auc
|
|
value: [0.99836868 0.96411093 0.99674267 0.99918567 0.997557 0.99104235
|
|
0.99918567 0.99837134 0.99918567 0.97964169]
|
|
|
|
mean value: 0.9923391660600135
|
|
|
|
key: test_jcc
|
|
value: [0.98550725 0.89473684 0.92 0.93243243 0.86075949 0.93150685
|
|
0.93150685 0.97142857 0.94444444 0.95774648]
|
|
|
|
mean value: 0.9330069207961785
|
|
|
|
key: train_jcc
|
|
value: [0.99675325 0.9331307 0.99351702 0.99837134 0.99513776 0.9824
|
|
0.99837398 0.99675325 0.99837398 0.96087637]
|
|
|
|
mean value: 0.9853687646105626
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07515264 0.06080317 0.06919646 0.07247376 0.07183075 0.0628655
|
|
0.06409168 0.06514215 0.06302285 0.06468797]
|
|
|
|
mean value: 0.0669266939163208
|
|
|
|
key: score_time
|
|
value: [0.01082349 0.01040816 0.01027727 0.01023531 0.01051998 0.01028347
|
|
0.01050258 0.0106225 0.01002073 0.0105207 ]
|
|
|
|
mean value: 0.010421419143676757
|
|
|
|
key: test_mcc
|
|
value: [1. 0.87631485 0.9158731 0.88920184 0.90184995 0.94280904
|
|
0.91533482 0.95681396 0.95681396 0.91533482]
|
|
|
|
mean value: 0.9270346350821976
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93430657 0.95620438 0.94160584 0.94852941 0.97058824
|
|
0.95588235 0.97794118 0.97794118 0.95588235]
|
|
|
|
mean value: 0.9618881494203521
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93793103 0.95833333 0.94520548 0.95104895 0.97142857
|
|
0.95774648 0.97841727 0.97841727 0.95774648]
|
|
|
|
mean value: 0.9636274859866248
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88311688 0.92 0.8961039 0.90666667 0.94444444
|
|
0.91891892 0.95774648 0.95774648 0.91891892]
|
|
|
|
mean value: 0.9303662685916207
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.93478261 0.95588235 0.94117647 0.94852941 0.97058824
|
|
0.95588235 0.97794118 0.97794118 0.95588235]
|
|
|
|
mean value: 0.9618606138107417
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88311688 0.92 0.8961039 0.90666667 0.94444444
|
|
0.91891892 0.95774648 0.95774648 0.91891892]
|
|
|
|
mean value: 0.9303662685916207
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.20083737 0.19879031 0.20903301 0.19165325 0.2012167 0.19774413
|
|
0.18660641 0.18971586 0.2091291 0.19987702]
|
|
|
|
mean value: 0.19846031665802003
|
|
|
|
key: score_time
|
|
value: [0.02021098 0.02027822 0.02088189 0.02075434 0.01990175 0.02190304
|
|
0.02135897 0.02068901 0.02141094 0.02544594]
|
|
|
|
mean value: 0.021283507347106934
|
|
|
|
key: test_mcc
|
|
value: [1. 0.98550725 1. 0.97120941 0.97100831 0.98540068
|
|
0.98540068 0.98540068 0.97100831 1. ]
|
|
|
|
mean value: 0.9854935311865046
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.99270073 1. 0.98540146 0.98529412 0.99264706
|
|
0.99264706 0.99264706 0.98529412 1. ]
|
|
|
|
mean value: 0.9926631601545728
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.99270073 1. 0.98571429 0.98550725 0.99270073
|
|
0.99270073 0.99270073 0.98550725 1. ]
|
|
|
|
mean value: 0.9927531698175939
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.98550725 1. 0.97183099 0.97142857 0.98550725
|
|
0.98550725 0.98550725 0.97142857 1. ]
|
|
|
|
mean value: 0.9856717114279883
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.99275362 1. 0.98529412 0.98529412 0.99264706
|
|
0.99264706 0.99264706 0.98529412 1. ]
|
|
|
|
mean value: 0.992657715260017
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.98550725 1. 0.97183099 0.97142857 0.98550725
|
|
0.98550725 0.98550725 0.97142857 1. ]
|
|
|
|
mean value: 0.9856717114279883
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02046251 0.02154493 0.02026796 0.02045822 0.01979375 0.02021337
|
|
0.02100277 0.02029634 0.02117229 0.02066207]
|
|
|
|
mean value: 0.02058742046356201
|
|
|
|
key: score_time
|
|
value: [0.01327848 0.01783276 0.01335645 0.01328135 0.01373529 0.01361465
|
|
0.01344752 0.01323819 0.01321507 0.01345181]
|
|
|
|
mean value: 0.013845157623291016
|
|
|
|
key: test_mcc
|
|
value: [0.94323594 0.94323594 0.90246052 0.87609014 0.91533482 0.83666003
|
|
0.92898531 0.90184995 0.92898531 0.94280904]
|
|
|
|
mean value: 0.9119646995632092
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97080292 0.97080292 0.94890511 0.93430657 0.95588235 0.91176471
|
|
0.96323529 0.94852941 0.96323529 0.97058824]
|
|
|
|
mean value: 0.9538052812365823
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97142857 0.97142857 0.95172414 0.93877551 0.95774648 0.91891892
|
|
0.96453901 0.95104895 0.96453901 0.97142857]
|
|
|
|
mean value: 0.9561577725446336
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94444444 0.94444444 0.90789474 0.88461538 0.91891892 0.85
|
|
0.93150685 0.90666667 0.93150685 0.94444444]
|
|
|
|
mean value: 0.9164442739006545
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97101449 0.97101449 0.94852941 0.93382353 0.95588235 0.91176471
|
|
0.96323529 0.94852941 0.96323529 0.97058824]
|
|
|
|
mean value: 0.9537617220801364
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94444444 0.94444444 0.90789474 0.88461538 0.91891892 0.85
|
|
0.93150685 0.90666667 0.93150685 0.94444444]
|
|
|
|
mean value: 0.9164442739006545
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [8.2776711 4.21938848 4.72391248 4.15709138 4.41321802 4.4025979
|
|
4.49912047 4.87815547 4.51623201 3.0488472 ]
|
|
|
|
mean value: 4.713623452186584
|
|
|
|
key: score_time
|
|
value: [0.28151059 0.13990307 0.17840743 0.16440082 0.13978744 0.13792515
|
|
0.16863894 0.14216638 0.10375071 0.11206579]
|
|
|
|
mean value: 0.15685563087463378
|
|
|
|
key: test_mcc
|
|
value: [1. 0.95713391 0.97120941 0.95710706 0.94280904 0.97100831
|
|
0.98540068 0.97100831 0.98540068 1. ]
|
|
|
|
mean value: 0.9741077402498759
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97810219 0.98540146 0.97810219 0.97058824 0.98529412
|
|
0.99264706 0.98529412 0.99264706 1. ]
|
|
|
|
mean value: 0.9868076427651353
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.97841727 0.98571429 0.9787234 0.97142857 0.98550725
|
|
0.99270073 0.98550725 0.99270073 1. ]
|
|
|
|
mean value: 0.9870699480192865
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.95774648 0.97183099 0.95833333 0.94444444 0.97142857
|
|
0.98550725 0.97142857 0.98550725 1. ]
|
|
|
|
mean value: 0.9746226878177277
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.97826087 0.98529412 0.97794118 0.97058824 0.98529412
|
|
0.99264706 0.98529412 0.99264706 1. ]
|
|
|
|
mean value: 0.9867966751918158
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.95774648 0.97183099 0.95833333 0.94444444 0.97142857
|
|
0.98550725 0.97142857 0.98550725 1. ]
|
|
|
|
mean value: 0.9746226878177277
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.30284762 1.3088181 1.33434439 1.30789351 1.41369963 1.26666975
|
|
1.26418257 1.34297037 1.33749843 1.32313275]
|
|
|
|
mean value: 1.3202057123184203
|
|
|
|
key: score_time
|
|
value: [0.24020386 0.2401464 0.17477727 0.19462323 0.14892769 0.15167356
|
|
0.25871444 0.14097333 0.13530946 0.15493202]
|
|
|
|
mean value: 0.18402812480926514
|
|
|
|
key: test_mcc
|
|
value: [0.98550418 0.92710997 0.94199209 0.94318882 0.89715584 0.94280904
|
|
0.97100831 0.95598573 0.97100831 0.98540068]
|
|
|
|
mean value: 0.9521162979128164
|
|
|
|
key: train_mcc
|
|
value: [0.98536269 0.98370525 0.98374725 0.99188303 0.98371336 0.98373423
|
|
0.9902753 0.98537469 0.9886636 0.98373423]
|
|
|
|
mean value: 0.9860193630081547
|
|
|
|
key: test_accuracy
|
|
value: [0.99270073 0.96350365 0.97080292 0.97080292 0.94852941 0.97058824
|
|
0.98529412 0.97794118 0.98529412 0.99264706]
|
|
|
|
mean value: 0.9758104336625161
|
|
|
|
key: train_accuracy
|
|
value: [0.99266504 0.99185004 0.99185004 0.99592502 0.99185668 0.99185668
|
|
0.99511401 0.99267101 0.99429967 0.99185668]
|
|
|
|
mean value: 0.9929944861676343
|
|
|
|
key: test_fscore
|
|
value: [0.99259259 0.96350365 0.97142857 0.97183099 0.94890511 0.97142857
|
|
0.98550725 0.97810219 0.98550725 0.99270073]
|
|
|
|
mean value: 0.9761506892950969
|
|
|
|
key: train_fscore
|
|
value: [0.99270073 0.99186992 0.99188312 0.99593826 0.99185668 0.99188312
|
|
0.99513776 0.99270073 0.99433198 0.99188312]
|
|
|
|
mean value: 0.9930185415479755
|
|
|
|
key: test_precision
|
|
value: [1. 0.95652174 0.95774648 0.94520548 0.94202899 0.94444444
|
|
0.97142857 0.97101449 0.97142857 0.98550725]
|
|
|
|
mean value: 0.9645326009394998
|
|
|
|
key: train_precision
|
|
value: [0.98869144 0.99025974 0.98707593 0.99190939 0.99185668 0.98867314
|
|
0.99032258 0.98869144 0.98872786 0.98867314]
|
|
|
|
mean value: 0.9894881324676252
|
|
|
|
key: test_recall
|
|
value: [0.98529412 0.97058824 0.98550725 1. 0.95588235 1.
|
|
1. 0.98529412 1. 1. ]
|
|
|
|
mean value: 0.9882566069906223
|
|
|
|
key: train_recall
|
|
value: [0.99674267 0.99348534 0.99673736 1. 0.99185668 0.99511401
|
|
1. 0.99674267 1. 0.99511401]
|
|
|
|
mean value: 0.9965792731852214
|
|
|
|
key: test_roc_auc
|
|
value: [0.99264706 0.96355499 0.9706948 0.97058824 0.94852941 0.97058824
|
|
0.98529412 0.97794118 0.98529412 0.99264706]
|
|
|
|
mean value: 0.9757779198635976
|
|
|
|
key: train_roc_auc
|
|
value: [0.99266171 0.99184871 0.99185402 0.99592834 0.99185668 0.99185668
|
|
0.99511401 0.99267101 0.99429967 0.99185668]
|
|
|
|
mean value: 0.9929947500146128
|
|
|
|
key: test_jcc
|
|
value: [0.98529412 0.92957746 0.94444444 0.94520548 0.90277778 0.94444444
|
|
0.97142857 0.95714286 0.97142857 0.98550725]
|
|
|
|
mean value: 0.9537250974931324
|
|
|
|
key: train_jcc
|
|
value: [0.98550725 0.98387097 0.98389694 0.99190939 0.98384491 0.98389694
|
|
0.99032258 0.98550725 0.98872786 0.98389694]
|
|
|
|
mean value: 0.9861381016950115
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.04272056 0.05943704 0.05153179 0.02936983 0.01935029 0.01928663
|
|
0.01935291 0.01940179 0.01969457 0.02082396]
|
|
|
|
mean value: 0.030096936225891113
|
|
|
|
key: score_time
|
|
value: [0.0132606 0.01356125 0.02673125 0.01406693 0.01334238 0.01322699
|
|
0.01327682 0.01417208 0.01328945 0.0136919 ]
|
|
|
|
mean value: 0.014861965179443359
|
|
|
|
key: test_mcc
|
|
value: [0.59494906 0.66746486 0.59205603 0.53314859 0.64733887 0.58430655
|
|
0.4853466 0.63296924 0.51476155 0.66183628]
|
|
|
|
mean value: 0.5914177619460329
|
|
|
|
key: train_mcc
|
|
value: [0.6332577 0.60556753 0.63000061 0.59274722 0.60261865 0.60755713
|
|
0.63849068 0.61889579 0.61728431 0.59446333]
|
|
|
|
mean value: 0.6140882947531848
|
|
|
|
key: test_accuracy
|
|
value: [0.79562044 0.83211679 0.79562044 0.76642336 0.82352941 0.78676471
|
|
0.74264706 0.81617647 0.75735294 0.83088235]
|
|
|
|
mean value: 0.7947133963074281
|
|
|
|
key: train_accuracy
|
|
value: [0.81662592 0.80277099 0.81499593 0.79625102 0.80130293 0.80374593
|
|
0.81921824 0.80944625 0.80863192 0.79723127]
|
|
|
|
mean value: 0.8070220394012037
|
|
|
|
key: test_fscore
|
|
value: [0.78125 0.82170543 0.8028169 0.76470588 0.82608696 0.80536913
|
|
0.74452555 0.82014388 0.75555556 0.83211679]
|
|
|
|
mean value: 0.7954276070370564
|
|
|
|
key: train_fscore
|
|
value: [0.81722177 0.80388979 0.81529699 0.79304636 0.80194805 0.80229696
|
|
0.81803279 0.8091354 0.80940795 0.79706601]
|
|
|
|
mean value: 0.8067342073255446
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.86885246 0.78082192 0.7761194 0.81428571 0.74074074
|
|
0.73913043 0.8028169 0.76119403 0.82608696]
|
|
|
|
mean value: 0.794338189073302
|
|
|
|
key: train_precision
|
|
value: [0.81523501 0.8 0.81331169 0.80504202 0.79935275 0.80826446
|
|
0.82343234 0.81045752 0.80613893 0.79771615]
|
|
|
|
mean value: 0.8078950870261012
|
|
|
|
key: test_recall
|
|
value: [0.73529412 0.77941176 0.82608696 0.75362319 0.83823529 0.88235294
|
|
0.75 0.83823529 0.75 0.83823529]
|
|
|
|
mean value: 0.799147485080989
|
|
|
|
key: train_recall
|
|
value: [0.81921824 0.80781759 0.81729201 0.78140294 0.80456026 0.79641694
|
|
0.81270358 0.80781759 0.81270358 0.79641694]
|
|
|
|
mean value: 0.8056349666030788
|
|
|
|
key: test_roc_auc
|
|
value: [0.79518329 0.83173487 0.79539642 0.76651748 0.82352941 0.78676471
|
|
0.74264706 0.81617647 0.75735294 0.83088235]
|
|
|
|
mean value: 0.7946184995737425
|
|
|
|
key: train_roc_auc
|
|
value: [0.8166238 0.80276687 0.81499779 0.79623893 0.80130293 0.80374593
|
|
0.81921824 0.80944625 0.80863192 0.79723127]
|
|
|
|
mean value: 0.8070203941740042
|
|
|
|
key: test_jcc
|
|
value: [0.64102564 0.69736842 0.67058824 0.61904762 0.7037037 0.6741573
|
|
0.59302326 0.69512195 0.60714286 0.7125 ]
|
|
|
|
mean value: 0.6613678987670822
|
|
|
|
key: train_jcc
|
|
value: [0.69093407 0.67208672 0.68818681 0.65706447 0.66937669 0.66986301
|
|
0.69209431 0.67945205 0.67983651 0.66260163]
|
|
|
|
mean value: 0.676149628585884
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.3358233 1.51815987 1.29080606 0.34695745 1.41464186 0.21330214
|
|
1.50222182 1.55371165 0.99084353 0.60692477]
|
|
|
|
mean value: 0.977339243888855
|
|
|
|
key: score_time
|
|
value: [0.01254296 0.01423478 0.01312089 0.01385069 0.01297879 0.01231813
|
|
0.01395893 0.01350212 0.02170682 0.01221442]
|
|
|
|
mean value: 0.014042854309082031
|
|
|
|
key: test_mcc
|
|
value: [1. 0.97122151 0.95710706 0.94318882 0.94280904 0.95681396
|
|
0.97100831 0.98540068 0.98540068 0.98540068]
|
|
|
|
mean value: 0.9698350737002243
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98540146 0.97810219 0.97080292 0.97058824 0.97794118
|
|
0.98529412 0.99264706 0.99264706 0.99264706]
|
|
|
|
mean value: 0.9846071275225419
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98550725 0.9787234 0.97183099 0.97142857 0.97841727
|
|
0.98550725 0.99270073 0.99270073 0.99270073]
|
|
|
|
mean value: 0.9849516910321079
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.97142857 0.95833333 0.94520548 0.94444444 0.95774648
|
|
0.97142857 0.98550725 0.98550725 0.98550725]
|
|
|
|
mean value: 0.970510861809065
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98550725 0.97794118 0.97058824 0.97058824 0.97794118
|
|
0.98529412 0.99264706 0.99264706 0.99264706]
|
|
|
|
mean value: 0.9845801364023871
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.97142857 0.95833333 0.94520548 0.94444444 0.95774648
|
|
0.97142857 0.98550725 0.98550725 0.98550725]
|
|
|
|
mean value: 0.970510861809065
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.06717873 0.10198259 0.08658075 0.10811377 0.61797523 0.07925892
|
|
0.11409497 0.10496569 0.09061217 0.10234213]
|
|
|
|
mean value: 0.14731049537658691
|
|
|
|
key: score_time
|
|
value: [0.02386189 0.02009273 0.02184725 0.02548599 0.02336287 0.03601003
|
|
0.02148342 0.01982164 0.01296329 0.02166152]
|
|
|
|
mean value: 0.0226590633392334
|
|
|
|
key: test_mcc
|
|
value: [0.812277 0.87099729 0.8251972 0.800926 0.70710678 0.82928843
|
|
0.79549513 0.8131434 0.78632938 0.89715584]
|
|
|
|
mean value: 0.8137916451598168
|
|
|
|
key: train_mcc
|
|
value: [0.84694026 0.87678779 0.87332809 0.86656872 0.88172633 0.87326071
|
|
0.8684111 0.86858635 0.88488253 0.8457717 ]
|
|
|
|
mean value: 0.8686263570909856
|
|
|
|
key: test_accuracy
|
|
value: [0.90510949 0.93430657 0.91240876 0.89781022 0.85294118 0.91176471
|
|
0.89705882 0.90441176 0.88970588 0.94852941]
|
|
|
|
mean value: 0.9054046801202232
|
|
|
|
key: train_accuracy
|
|
value: [0.92339038 0.93806031 0.93643032 0.93317033 0.94055375 0.93648208
|
|
0.93403909 0.93403909 0.94218241 0.92263844]
|
|
|
|
mean value: 0.9340986198163471
|
|
|
|
key: test_fscore
|
|
value: [0.90076336 0.93617021 0.91176471 0.90410959 0.85714286 0.91666667
|
|
0.9 0.90909091 0.89655172 0.94890511]
|
|
|
|
mean value: 0.9081165132995447
|
|
|
|
key: train_fscore
|
|
value: [0.92419355 0.93929712 0.93739968 0.93387097 0.94164668 0.93729904
|
|
0.93493976 0.93514812 0.94315452 0.92393915]
|
|
|
|
mean value: 0.9350888590196927
|
|
|
|
key: test_precision
|
|
value: [0.93650794 0.90410959 0.92537313 0.85714286 0.83333333 0.86842105
|
|
0.875 0.86666667 0.84415584 0.94202899]
|
|
|
|
mean value: 0.8852739399314917
|
|
|
|
key: train_precision
|
|
value: [0.91533546 0.92163009 0.92259084 0.92344498 0.92464678 0.92539683
|
|
0.92234548 0.91968504 0.92755906 0.90866142]
|
|
|
|
mean value: 0.9211295973019243
|
|
|
|
key: test_recall
|
|
value: [0.86764706 0.97058824 0.89855072 0.95652174 0.88235294 0.97058824
|
|
0.92647059 0.95588235 0.95588235 0.95588235]
|
|
|
|
mean value: 0.9340366581415175
|
|
|
|
key: train_recall
|
|
value: [0.93322476 0.95765472 0.95269168 0.94453507 0.95928339 0.9495114
|
|
0.94788274 0.95114007 0.95928339 0.93973941]
|
|
|
|
mean value: 0.9494946623377313
|
|
|
|
key: test_roc_auc
|
|
value: [0.90483802 0.93456948 0.91251066 0.89737852 0.85294118 0.91176471
|
|
0.89705882 0.90441176 0.88970588 0.94852941]
|
|
|
|
mean value: 0.9053708439897699
|
|
|
|
key: train_roc_auc
|
|
value: [0.92338236 0.93804433 0.93644356 0.93317959 0.94055375 0.93648208
|
|
0.93403909 0.93403909 0.94218241 0.92263844]
|
|
|
|
mean value: 0.9340984691085121
|
|
|
|
key: test_jcc
|
|
value: [0.81944444 0.88 0.83783784 0.825 0.75 0.84615385
|
|
0.81818182 0.83333333 0.8125 0.90277778]
|
|
|
|
mean value: 0.8325229057729058
|
|
|
|
key: train_jcc
|
|
value: [0.85907046 0.88554217 0.88217523 0.87594554 0.8897281 0.88199697
|
|
0.87782805 0.87819549 0.89242424 0.85863095]
|
|
|
|
mean value: 0.8781537205877241
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.85
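Note: each "mean value" line appears to be the arithmetic mean of the ten per-fold scores in the "value" array printed just above it. The following is a minimal sketch under that assumption; the variable names are illustrative and not taken from the original script. It recomputes the LDA test_mcc mean from the fold values logged above.

```python
from statistics import fmean

# Per-fold test_mcc values for the LDA pipeline, copied from the log above.
lda_test_mcc = [
    0.812277, 0.87099729, 0.8251972, 0.800926, 0.70710678,
    0.82928843, 0.79549513, 0.8131434, 0.78632938, 0.89715584,
]

# Assumption: the logged "mean value" is a plain average over the folds.
lda_test_mcc_mean = fmean(lda_test_mcc)

# Value printed in the log for "key: test_mcc" of the LDA run.
logged_mean = 0.8137916451598168

print(f"recomputed mean: {lda_test_mcc_mean:.6f}")
print(f"matches log: {abs(lda_test_mcc_mean - logged_mean) < 1e-7}")
```

The fold values are printed to eight decimal places, so the recomputed mean agrees with the logged one only up to that rounding.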
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01767254 0.02120185 0.02006078 0.01994205 0.01963353 0.01972198
|
|
0.01966166 0.02000618 0.0196619 0.01966596]
|
|
|
|
mean value: 0.019722843170166017
|
|
|
|
key: score_time
|
|
value: [0.01396465 0.01512766 0.01483417 0.01484108 0.01479626 0.01485062
|
|
0.01475906 0.01479578 0.01479769 0.0147922 ]
|
|
|
|
mean value: 0.014755916595458985
|
|
|
|
key: test_mcc
|
|
value: [0.59324085 0.6380904 0.59804827 0.63690876 0.69125122 0.5905386
|
|
0.54464795 0.61871843 0.56101167 0.67911938]
|
|
|
|
mean value: 0.6151575523836643
|
|
|
|
key: train_mcc
|
|
value: [0.64023262 0.62893946 0.63068241 0.61268574 0.63551639 0.63232786
|
|
0.66325449 0.64177893 0.66140974 0.63849068]
|
|
|
|
mean value: 0.6385318324603495
|
|
|
|
key: test_accuracy
|
|
value: [0.79562044 0.81751825 0.79562044 0.81751825 0.84558824 0.79411765
|
|
0.77205882 0.80882353 0.77941176 0.83823529]
|
|
|
|
mean value: 0.8064512666380421
|
|
|
|
key: train_accuracy
|
|
value: [0.8198859 0.81418093 0.81499593 0.80603097 0.81758958 0.81596091
|
|
0.83143322 0.82084691 0.83061889 0.81921824]
|
|
|
|
mean value: 0.8190761476974374
|
|
|
|
key: test_fscore
|
|
value: [0.78461538 0.80620155 0.78125 0.81203008 0.84444444 0.8028169
|
|
0.77697842 0.8030303 0.76923077 0.83076923]
|
|
|
|
mean value: 0.8011367076340337
|
|
|
|
key: train_fscore
|
|
value: [0.81659751 0.81031614 0.81035923 0.80133556 0.81456954 0.81260365
|
|
0.82850041 0.81937603 0.82866557 0.81803279]
|
|
|
|
mean value: 0.8160356421443249
|
|
|
|
key: test_precision
|
|
value: [0.82258065 0.85245902 0.84745763 0.84375 0.85074627 0.77027027
|
|
0.76056338 0.828125 0.80645161 0.87096774]
|
|
|
|
mean value: 0.8253371562720764
|
|
|
|
key: train_precision
|
|
value: [0.83248731 0.82823129 0.83047945 0.82051282 0.82828283 0.8277027
|
|
0.84317032 0.82615894 0.83833333 0.82343234]
|
|
|
|
mean value: 0.8298791343084553
|
|
|
|
key: test_recall
|
|
value: [0.75 0.76470588 0.72463768 0.7826087 0.83823529 0.83823529
|
|
0.79411765 0.77941176 0.73529412 0.79411765]
|
|
|
|
mean value: 0.7801364023870417
|
|
|
|
key: train_recall
|
|
value: [0.80130293 0.79315961 0.79119086 0.78303426 0.80130293 0.7980456
|
|
0.81433225 0.81270358 0.81921824 0.81270358]
|
|
|
|
mean value: 0.8026993851990797
|
|
|
|
key: test_roc_auc
|
|
value: [0.79528986 0.81713555 0.79614237 0.81777494 0.84558824 0.79411765
|
|
0.77205882 0.80882353 0.77941176 0.83823529]
|
|
|
|
mean value: 0.806457800511509
|
|
|
|
key: train_roc_auc
|
|
value: [0.81990106 0.81419808 0.81497654 0.80601224 0.81758958 0.81596091
|
|
0.83143322 0.82084691 0.83061889 0.81921824]
|
|
|
|
mean value: 0.8190755668443231
|
|
|
|
key: test_jcc
|
|
value: [0.64556962 0.67532468 0.64102564 0.6835443 0.73076923 0.67058824
|
|
0.63529412 0.67088608 0.625 0.71052632]
|
|
|
|
mean value: 0.6688528215850197
|
|
|
|
key: train_jcc
|
|
value: [0.69004208 0.68111888 0.68117978 0.66852368 0.68715084 0.68435754
|
|
0.70721358 0.69401947 0.70745429 0.69209431]
|
|
|
|
mean value: 0.6893154442079789
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.75
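Note: the test_mcc and test_jcc keys presumably hold the Matthews correlation coefficient and the Jaccard score. The following is a minimal sketch of both formulas in plain Python. The confusion-matrix counts are an assumption: they were back-calculated to be consistent with the fifth Multinomial fold above (accuracy 0.84558824, precision 0.85074627, recall 0.83823529) and are not printed anywhere in the log. The function names are illustrative.

```python
from math import sqrt


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when any marginal is empty (a common convention).
    return (tp * tn - fp * fn) / denom if denom else 0.0


def jaccard(tp: int, fp: int, fn: int) -> float:
    """Jaccard score of the positive class: TP / (TP + FP + FN)."""
    total = tp + fp + fn
    return tp / total if total else 0.0


# Hypothetical counts inferred for Multinomial fold 5 (136 test samples).
tp, tn, fp, fn = 57, 58, 10, 11

fold_mcc = mcc(tp, tn, fp, fn)
fold_jcc = jaccard(tp, fp, fn)
print(f"mcc: {fold_mcc:.6f}  jcc: {fold_jcc:.6f}")
```

With these assumed counts, the two functions give values that agree with the logged fold scores (test_mcc 0.69125122, test_jcc 0.73076923) to within rounding.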
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03860188 0.04693508 0.0453124 0.03687978 0.04116654 0.03766918
|
|
0.03457999 0.03815031 0.07156754 0.03939486]
|
|
|
|
mean value: 0.043025755882263185
|
|
|
|
key: score_time
|
|
value: [0.01524901 0.01752234 0.02431965 0.01302862 0.01294494 0.01732397
|
|
0.02216196 0.03092146 0.02712727 0.01669025]
|
|
|
|
mean value: 0.019728946685791015
|
|
|
|
key: test_mcc
|
|
value: [0.89863497 0.81027501 0.83947987 0.82066286 0.76503685 0.56666667
|
|
0.7768986 0.71492035 0.79446135 0.83258145]
|
|
|
|
mean value: 0.7819617977261805
|
|
|
|
key: train_mcc
|
|
value: [0.87474955 0.87939986 0.83059333 0.82722888 0.88445984 0.56230531
|
|
0.83579768 0.72591688 0.8681346 0.75795971]
|
|
|
|
mean value: 0.8046545639530074
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.90510949 0.91970803 0.90510949 0.88235294 0.75
|
|
0.88235294 0.83823529 0.89705882 0.91176471]
|
|
|
|
mean value: 0.8840596822670674
|
|
|
|
key: train_accuracy
|
|
value: [0.93724531 0.9396903 0.91524042 0.90872046 0.94218241 0.74348534
|
|
0.91286645 0.84609121 0.93403909 0.87052117]
|
|
|
|
mean value: 0.8950082163269966
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9037037 0.92086331 0.91275168 0.88405797 0.67307692
|
|
0.89189189 0.86075949 0.89855072 0.9047619 ]
|
|
|
|
mean value: 0.8797786021014982
|
|
|
|
key: train_fscore
|
|
value: [0.9380531 0.93954248 0.91585761 0.91515152 0.94260307 0.65798046
|
|
0.9191232 0.86624204 0.93441296 0.85532302]
|
|
|
|
mean value: 0.8884289448756847
|
|
|
|
key: test_precision
|
|
value: [0.96923077 0.91044776 0.91428571 0.85 0.87142857 0.97222222
|
|
0.825 0.75555556 0.88571429 0.98275862]
|
|
|
|
mean value: 0.8936643500320803
|
|
|
|
key: train_precision
|
|
value: [0.92686804 0.94262295 0.90850722 0.854314 0.93579454 0.98697068
|
|
0.85754584 0.76595745 0.92914654 0.96907216]
|
|
|
|
mean value: 0.9076799436662107
|
|
|
|
key: test_recall
|
|
value: [0.92647059 0.89705882 0.92753623 0.98550725 0.89705882 0.51470588
|
|
0.97058824 1. 0.91176471 0.83823529]
|
|
|
|
mean value: 0.8868925831202046
|
|
|
|
key: train_recall
|
|
value: [0.9495114 0.93648208 0.9233279 0.98531811 0.9495114 0.49348534
|
|
0.99022801 0.99674267 0.93973941 0.76547231]
|
|
|
|
mean value: 0.8929818641699124
|
|
|
|
key: test_roc_auc
|
|
value: [0.94874254 0.90505115 0.91965047 0.90451833 0.88235294 0.75
|
|
0.88235294 0.83823529 0.89705882 0.91176471]
|
|
|
|
mean value: 0.8839727195225917
|
|
|
|
key: train_roc_auc
|
|
value: [0.93723531 0.93969292 0.91524701 0.90878283 0.94218241 0.74348534
|
|
0.91286645 0.84609121 0.93403909 0.87052117]
|
|
|
|
mean value: 0.8950143736948101
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.82432432 0.85333333 0.83950617 0.79220779 0.50724638
|
|
0.80487805 0.75555556 0.81578947 0.82608696]
|
|
|
|
mean value: 0.7918928034058543
|
|
|
|
key: train_jcc
|
|
value: [0.88333333 0.88597843 0.84477612 0.84357542 0.89143731 0.49029126
|
|
0.85034965 0.76404494 0.8768997 0.74721781]
|
|
|
|
mean value: 0.8077903967346308
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04438257 0.04041004 0.03820419 0.04464841 0.03713417 0.04714775
|
|
0.04193568 0.03920817 0.04001617 0.03890991]
|
|
|
|
mean value: 0.04119970798492432
|
|
|
|
key: score_time
|
|
value: [0.01795697 0.0130899 0.01262331 0.01278138 0.01264524 0.01248693
|
|
0.0125463 0.01261425 0.0127604 0.01984 ]
|
|
|
|
mean value: 0.013934469223022461
|
|
|
|
key: test_mcc
|
|
value: [0.9158731 0.89869927 0.78803902 0.84660737 0.64549722 0.82402205
|
|
0.80961181 0.69577462 0.82928843 0.84567499]
|
|
|
|
mean value: 0.8099087887121932
|
|
|
|
key: train_mcc
|
|
value: [0.88594196 0.88186074 0.82791629 0.89452489 0.70887969 0.90755723
|
|
0.88782117 0.71461937 0.90692901 0.84856482]
|
|
|
|
mean value: 0.8464615175108487
|
|
|
|
key: test_accuracy
|
|
value: [0.95620438 0.94890511 0.89051095 0.91970803 0.79411765 0.90441176
|
|
0.90441176 0.83088235 0.91176471 0.91911765]
|
|
|
|
mean value: 0.8980034349506226
|
|
|
|
key: train_accuracy
|
|
value: [0.94295029 0.9405053 0.91198044 0.94621027 0.83550489 0.9519544
|
|
0.94381107 0.84527687 0.95276873 0.9218241 ]
|
|
|
|
mean value: 0.9192786356915121
|
|
|
|
key: test_fscore
|
|
value: [0.95384615 0.94964029 0.88372093 0.92517007 0.82926829 0.91275168
|
|
0.90647482 0.8 0.91666667 0.92413793]
|
|
|
|
mean value: 0.9001676828256018
|
|
|
|
key: train_fscore
|
|
value: [0.94327391 0.94183267 0.90737564 0.94794953 0.85834502 0.95401403
|
|
0.94439968 0.82242991 0.9540412 0.92581144]
|
|
|
|
mean value: 0.9199473022076147
|
|
|
|
key: test_precision
|
|
value: [1. 0.92957746 0.95 0.87179487 0.70833333 0.83950617
|
|
0.88732394 0.9787234 0.86842105 0.87012987]
|
|
|
|
mean value: 0.8903810113435183
|
|
|
|
key: train_precision
|
|
value: [0.93870968 0.92199688 0.95660036 0.91755725 0.75369458 0.91479821
|
|
0.93460925 0.96491228 0.92901235 0.88088235]
|
|
|
|
mean value: 0.9112773188146082
|
|
|
|
key: test_recall
|
|
value: [0.91176471 0.97058824 0.82608696 0.98550725 1. 1.
|
|
0.92647059 0.67647059 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9252770673486787
|
|
|
|
key: train_recall
|
|
value: [0.94788274 0.96254072 0.862969 0.98042414 0.99674267 0.99674267
|
|
0.95439739 0.71661238 0.98045603 0.97557003]
|
|
|
|
mean value: 0.9374337773857411
|
|
|
|
key: test_roc_auc
|
|
value: [0.95588235 0.94906223 0.89098465 0.91922421 0.79411765 0.90441176
|
|
0.90441176 0.83088235 0.91176471 0.91911765]
|
|
|
|
mean value: 0.8979859335038363
|
|
|
|
key: train_roc_auc
|
|
value: [0.94294626 0.94048732 0.91194053 0.94623813 0.83550489 0.9519544
|
|
0.94381107 0.84527687 0.95276873 0.9218241 ]
|
|
|
|
mean value: 0.9192752310152983
|
|
|
|
key: test_jcc
|
|
value: [0.91176471 0.90410959 0.79166667 0.86075949 0.70833333 0.83950617
|
|
0.82894737 0.66666667 0.84615385 0.85897436]
|
|
|
|
mean value: 0.8216882201649766
|
|
|
|
key: train_jcc
|
|
value: [0.89263804 0.89006024 0.83045526 0.90104948 0.75184275 0.91207154
|
|
0.89465649 0.6984127 0.91212121 0.8618705 ]
|
|
|
|
mean value: 0.8545178201608485
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.36904597 0.34564185 0.41399765 0.32139015 0.33926725 0.32440686
|
|
0.38246441 0.38453221 0.37067652 0.35346103]
|
|
|
|
mean value: 0.36048839092254636
|
|
|
|
key: score_time
|
|
value: [0.01911259 0.01781178 0.01721931 0.01834321 0.01791716 0.01647162
|
|
0.01970482 0.02053666 0.01990604 0.01659083]
|
|
|
|
mean value: 0.018361401557922364
|
|
|
|
key: test_mcc
|
|
value: [0.95629932 0.90025835 0.94199209 0.92944673 0.8722811 0.91334626
|
|
0.8979331 0.91533482 0.88388348 0.95681396]
|
|
|
|
mean value: 0.9167589213960957
|
|
|
|
key: train_mcc
|
|
value: [0.94968259 0.96455457 0.95301603 0.95813054 0.97253071 0.94640596
|
|
0.95126624 0.9726856 0.9534379 0.97586128]
|
|
|
|
mean value: 0.9597571420489015
|
|
|
|
key: test_accuracy
|
|
value: [0.97810219 0.94890511 0.97080292 0.96350365 0.93382353 0.95588235
|
|
0.94852941 0.95588235 0.94117647 0.97794118]
|
|
|
|
mean value: 0.9574549162730785
|
|
|
|
key: train_accuracy
|
|
value: [0.97473513 0.98207009 0.97636512 0.97881011 0.98615635 0.97312704
|
|
0.97557003 0.98615635 0.97638436 0.98778502]
|
|
|
|
mean value: 0.9797159593192262
|
|
|
|
key: test_fscore
|
|
value: [0.97777778 0.95035461 0.97142857 0.96503497 0.93706294 0.95714286
|
|
0.94964029 0.95774648 0.94285714 0.97841727]
|
|
|
|
mean value: 0.9587462894063403
|
|
|
|
key: train_fscore
|
|
value: [0.97502015 0.9823435 0.97663175 0.97913323 0.98630137 0.97336562
|
|
0.97576737 0.98634538 0.97681855 0.98793242]
|
|
|
|
mean value: 0.9799659321423493
|
|
|
|
key: test_precision
|
|
value: [0.98507463 0.91780822 0.95774648 0.93243243 0.89333333 0.93055556
|
|
0.92957746 0.91891892 0.91666667 0.95774648]
|
|
|
|
mean value: 0.9339860175485872
|
|
|
|
key: train_precision
|
|
value: [0.96491228 0.96835443 0.96496815 0.96366509 0.97607656 0.9648
|
|
0.96794872 0.97305864 0.95918367 0.97615262]
|
|
|
|
mean value: 0.9679120157573049
|
|
|
|
key: test_recall
|
|
value: [0.97058824 0.98529412 0.98550725 1. 0.98529412 0.98529412
|
|
0.97058824 1. 0.97058824 1. ]
|
|
|
|
mean value: 0.9853154305200341
|
|
|
|
key: train_recall
|
|
value: [0.98534202 0.99674267 0.98858075 0.99510604 0.99674267 0.98208469
|
|
0.98371336 1. 0.99511401 1. ]
|
|
|
|
mean value: 0.9923426199977682
|
|
|
|
key: test_roc_auc
|
|
value: [0.97804774 0.9491688 0.9706948 0.96323529 0.93382353 0.95588235
|
|
0.94852941 0.95588235 0.94117647 0.97794118]
|
|
|
|
mean value: 0.9574381926683717
|
|
|
|
key: train_roc_auc
|
|
value: [0.97472647 0.98205812 0.97637507 0.97882338 0.98615635 0.97312704
|
|
0.97557003 0.98615635 0.97638436 0.98778502]
|
|
|
|
mean value: 0.9797162191603211
|
|
|
|
key: test_jcc
|
|
value: [0.95652174 0.90540541 0.94444444 0.93243243 0.88157895 0.91780822
|
|
0.90410959 0.91891892 0.89189189 0.95774648]
|
|
|
|
mean value: 0.9210858066684366
|
|
|
|
key: train_jcc
|
|
value: [0.95125786 0.96529968 0.95433071 0.9591195 0.97297297 0.94811321
|
|
0.95268139 0.97305864 0.9546875 0.97615262]
|
|
|
|
mean value: 0.9607674080522771
|
|
|
|
MCC on Blind test: 0.65
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.22876954 0.08565283 0.19133615 0.09375477 0.20426679 0.21190095
|
|
0.19379854 0.18301129 0.19188833 0.18430281]
|
|
|
|
mean value: 0.17686820030212402
|
|
|
|
key: score_time
|
|
value: [0.04507017 0.02498293 0.02771354 0.02370334 0.02618623 0.02305579
|
|
0.02408266 0.0239079 0.02499819 0.0262382 ]
|
|
|
|
mean value: 0.02699389457702637
|
|
|
|
key: test_mcc
|
|
value: [1. 0.95713391 0.92944673 0.90246052 0.8753478 0.92898531
|
|
0.91533482 0.95681396 0.95681396 0.98540068]
|
|
|
|
mean value: 0.9407737686262707
|
|
|
|
key: train_mcc
|
|
value: [0.99837133 0.99837133 0.99674532 1. 0.99837266 0.99350642
|
|
0.99674796 1. 0.99837266 0.99674796]
|
|
|
|
mean value: 0.9977235643412998
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97810219 0.96350365 0.94890511 0.93382353 0.96323529
|
|
0.95588235 0.97794118 0.97794118 0.99264706]
|
|
|
|
mean value: 0.9691981537140404
|
|
|
|
key: train_accuracy
|
|
value: [0.999185 0.999185 0.99837001 1. 0.99918567 0.99674267
|
|
0.99837134 1. 0.99918567 0.99837134]
|
|
|
|
mean value: 0.9988596693824349
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
|
|
key: test_fscore
|
|
value: [1. 0.97841727 0.96503497 0.95172414 0.93793103 0.96453901
|
|
0.95774648 0.97841727 0.97841727 0.99270073]
|
|
|
|
mean value: 0.9704928151902354
|
|
|
|
key: train_fscore
|
|
value: [0.99918633 0.99918633 0.99837134 1. 0.99918633 0.99675325
|
|
0.99837398 1. 0.99918633 0.99837398]
|
|
|
|
mean value: 0.998861787113732
|
|
|
|
key: test_precision
|
|
value: [1. 0.95774648 0.93243243 0.90789474 0.88311688 0.93150685
|
|
0.91891892 0.95774648 0.95774648 0.98550725]
|
|
|
|
mean value: 0.9432616503621938
|
|
|
|
key: train_precision
|
|
value: [0.99837398 0.99837398 0.99674797 1. 0.99837398 0.99352751
|
|
0.99675325 1. 0.99837398 0.99675325]
|
|
|
|
mean value: 0.9977277904036133
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.97826087 0.96323529 0.94852941 0.93382353 0.96323529
|
|
0.95588235 0.97794118 0.97794118 0.99264706]
|
|
|
|
mean value: 0.9691496163682864
|
|
|
|
key: train_roc_auc
|
|
value: [0.99918434 0.99918434 0.99837134 1. 0.99918567 0.99674267
|
|
0.99837134 1. 0.99918567 0.99837134]
|
|
|
|
mean value: 0.9988596691659006
|
|
|
|
key: test_jcc
|
|
value: [1. 0.95774648 0.93243243 0.90789474 0.88311688 0.93150685
|
|
0.91891892 0.95774648 0.95774648 0.98550725]
|
|
|
|
mean value: 0.9432616503621938
|
|
|
|
key: train_jcc
|
|
value: [0.99837398 0.99837398 0.99674797 1. 0.99837398 0.99352751
|
|
0.99675325 1. 0.99837398 0.99675325]
|
|
|
|
mean value: 0.9977277904036133
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.89
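
The OOB warning logged during the Bagging Classifier run above is what one would expect when the ensemble is left at its default of 10 estimators: some rows end up inside every bootstrap sample and so never receive an out-of-bag prediction. The arithmetic can be sketched as below. This is only a rough estimate, assuming bootstrap samples as large as the training fold and a fold of about 1227 rows (a figure inferred from the train_accuracy values, not printed in this log); the function name is illustrative.

```python
def expected_rows_without_oob(n_rows: int, n_estimators: int) -> float:
    """Expected number of rows that are in bag for every estimator."""
    # Probability that a given row is drawn at least once into one
    # bootstrap sample of size n_rows (sampling with replacement).
    p_in_bag = 1.0 - (1.0 - 1.0 / n_rows) ** n_rows
    # A row has no OOB prediction only if it is in bag for all estimators.
    return n_rows * p_in_bag ** n_estimators


N_ROWS = 1227  # assumption: approximate size of one training fold

for n_estimators in (10, 50, 100):
    estimate = expected_rows_without_oob(N_ROWS, n_estimators)
    print(f"n_estimators={n_estimators}: ~{estimate:.3g} rows without an OOB score")
```

Under these assumptions roughly a dozen rows per fold have no OOB estimate with 10 estimators, while the expected count becomes negligible at 100. Rows with no OOB votes would also produce the 0/0 division reported in the accompanying RuntimeWarning.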
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.34814286 1.62543893 1.64354491 1.46121097 1.30118752 1.36093926
|
|
1.41096067 1.34231901 1.25550342 1.57961941]
|
|
|
|
mean value: 1.4328866958618165
|
|
|
|
key: score_time
|
|
value: [0.08763576 0.07004118 0.08638644 0.06822419 0.05681825 0.07667327
|
|
0.08930254 0.07778025 0.07974267 0.04862738]
|
|
|
|
mean value: 0.0741231918334961
|
|
|
|
key: test_mcc
|
|
value: [0.90025835 0.91281179 0.92709446 0.88920184 0.8753478 0.90184995
|
|
0.8722811 0.88388348 0.8623165 0.89949371]
|
|
|
|
mean value: 0.8924538968583584
|
|
|
|
key: train_mcc
|
|
value: [0.97232223 0.97558234 0.97235367 0.97555143 0.97232431 0.97070464
|
|
0.97234494 0.96744724 0.97070464 0.96911836]
|
|
|
|
mean value: 0.9718453807362983
|
|
|
|
key: test_accuracy
|
|
value: [0.94890511 0.95620438 0.96350365 0.94160584 0.93382353 0.94852941
|
|
0.93382353 0.94117647 0.92647059 0.94852941]
|
|
|
|
mean value: 0.9442571919278661
|
|
|
|
key: train_accuracy
|
|
value: [0.98614507 0.98777506 0.98614507 0.98777506 0.98615635 0.98534202
|
|
0.98615635 0.98371336 0.98534202 0.98452769]
|
|
|
|
mean value: 0.9859078045814983
|
|
|
|
key: test_fscore
|
|
value: [0.95035461 0.95652174 0.96402878 0.94520548 0.93793103 0.95104895
|
|
0.93706294 0.94285714 0.93150685 0.95035461]
|
|
|
|
mean value: 0.946687213018592
|
|
|
|
key: train_fscore
|
|
value: [0.98621249 0.98783455 0.98621249 0.98777506 0.98619009 0.98538961
|
|
0.98621249 0.98376623 0.98538961 0.98461538]
|
|
|
|
mean value: 0.9859598009108499
|
|
|
|
key: test_precision
|
|
value: [0.91780822 0.94285714 0.95714286 0.8961039 0.88311688 0.90666667
|
|
0.89333333 0.91666667 0.87179487 0.91780822]
|
|
|
|
mean value: 0.9103298756038481
|
|
|
|
key: train_precision
|
|
value: [0.9822294 0.98384491 0.98064516 0.98697068 0.98379254 0.98220065
|
|
0.9822294 0.98058252 0.98220065 0.97906602]
|
|
|
|
mean value: 0.9823761946884859
|
|
|
|
key: test_recall
|
|
value: [0.98529412 0.97058824 0.97101449 1. 1. 1.
|
|
0.98529412 0.97058824 1. 0.98529412]
|
|
|
|
mean value: 0.9868073316283035
|
|
|
|
key: train_recall
|
|
value: [0.99022801 0.99185668 0.99184339 0.98858075 0.98859935 0.98859935
|
|
0.99022801 0.98697068 0.98859935 0.99022801]
|
|
|
|
mean value: 0.9895733589810353
|
|
|
|
key: test_roc_auc
|
|
value: [0.9491688 0.95630861 0.96344842 0.94117647 0.93382353 0.94852941
|
|
0.93382353 0.94117647 0.92647059 0.94852941]
|
|
|
|
mean value: 0.9442455242966752
|
|
|
|
key: train_roc_auc
|
|
value: [0.98614174 0.98777173 0.98614971 0.98777572 0.98615635 0.98534202
|
|
0.98615635 0.98371336 0.98534202 0.98452769]
|
|
|
|
mean value: 0.9859076682731905
|
|
|
|
key: test_jcc
|
|
value: [0.90540541 0.91666667 0.93055556 0.8961039 0.88311688 0.90666667
|
|
0.88157895 0.89189189 0.87179487 0.90540541]
|
|
|
|
mean value: 0.8989186189975663
|
|
|
|
key: train_jcc
|
|
value: [0.9728 0.97596154 0.9728 0.97584541 0.97275641 0.9712
|
|
0.9728 0.96805112 0.9712 0.96969697]
|
|
|
|
mean value: 0.97231114472538
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.52704644 1.46939826 1.52141833 3.05834937 1.42187619 1.40447807
|
|
1.40688801 1.39297247 1.40064192 1.55902958]
|
|
|
|
mean value: 1.616209864616394
|
|
|
|
key: score_time
|
|
value: [0.01008701 0.00988674 0.01376224 0.00981331 0.01079798 0.00973082
|
|
0.00974679 0.00969934 0.00993872 0.01312947]
|
|
|
|
mean value: 0.010659241676330566
|
|
|
|
key: test_mcc
|
|
value: [1. 0.92951942 0.94318882 0.90246052 0.90184995 0.92898531
|
|
0.92898531 0.95681396 0.94280904 0.97100831]
|
|
|
|
mean value: 0.9405620639400689
|
|
|
|
key: train_mcc
|
|
value: [0.99188292 0.9886543 0.98865451 0.98543628 0.99674796 0.9902753
|
|
0.9902753 0.99188957 0.98705447 0.98705447]
|
|
|
|
mean value: 0.9897925072080538
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96350365 0.97080292 0.94890511 0.94852941 0.96323529
|
|
0.96323529 0.97794118 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9692035208243881
|
|
|
|
key: train_accuracy
|
|
value: [0.99592502 0.99429503 0.99429503 0.99266504 0.99837134 0.99511401
|
|
0.99511401 0.99592834 0.99348534 0.99348534]
|
|
|
|
mean value: 0.9948678485434934
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96453901 0.97183099 0.95172414 0.95104895 0.96453901
|
|
0.96453901 0.97841727 0.97142857 0.98550725]
|
|
|
|
mean value: 0.9703574180164507
|
|
|
|
key: train_fscore
|
|
value: [0.99594485 0.99433198 0.99432279 0.99271255 0.99837398 0.99513776
|
|
0.99513776 0.99594485 0.99352751 0.99352751]
|
|
|
|
mean value: 0.9948961550938449
|
|
|
|
key: test_precision
|
|
value: [1. 0.93150685 0.94520548 0.90789474 0.90666667 0.93150685
|
|
0.93150685 0.95774648 0.94444444 0.97142857]
|
|
|
|
mean value: 0.9427906925652287
|
|
|
|
key: train_precision
|
|
value: [0.99192246 0.98872786 0.98870968 0.98553055 0.99675325 0.99032258
|
|
0.99032258 0.99192246 0.98713826 0.98713826]
|
|
|
|
mean value: 0.9898487928857995
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.96376812 0.97058824 0.94852941 0.94852941 0.96323529
|
|
0.96323529 0.97794118 0.97058824 0.98529412]
|
|
|
|
mean value: 0.9691709292412618
|
|
|
|
key: train_roc_auc
|
|
value: [0.9959217 0.99429038 0.99429967 0.99267101 0.99837134 0.99511401
|
|
0.99511401 0.99592834 0.99348534 0.99348534]
|
|
|
|
mean value: 0.9948681127152733
|
|
|
|
key: test_jcc
|
|
value: [1. 0.93150685 0.94520548 0.90789474 0.90666667 0.93150685
|
|
0.93150685 0.95774648 0.94444444 0.97142857]
|
|
|
|
mean value: 0.9427906925652287
|
|
|
|
key: train_jcc
|
|
value: [0.99192246 0.98872786 0.98870968 0.98553055 0.99675325 0.99032258
|
|
0.99032258 0.99192246 0.98713826 0.98713826]
|
|
|
|
mean value: 0.9898487928857995
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.92
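
In the Bagging Classifier and Gradient Boosting runs above, the per-fold test_jcc values coincide with test_precision. That is what the definitions predict whenever recall is 1.0, since the Jaccard index TP / (TP + FP + FN) reduces to precision TP / (TP + FP) when FN = 0. A small sketch of the identity follows. The counts are hypothetical, chosen only because they reproduce one of the logged fold values (0.95774648); the actual confusion matrices are not shown in this log.

```python
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)


def jaccard(tp: int, fp: int, fn: int) -> float:
    return tp / (tp + fp + fn)


def f1(tp: int, fp: int, fn: int) -> float:
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for one fold: no false negatives, so recall is 1.0.
tp, fp, fn = 68, 3, 0

p = precision(tp, fp)
j = jaccard(tp, fp, fn)
f = f1(tp, fp, fn)

print(round(p, 8), round(j, 8), round(f, 8))

# With fn == 0 the Jaccard index equals precision, and F1 = 2J / (1 + J).
assert j == p
assert abs(f - 2 * j / (1 + j)) < 1e-12
```

The same relation F1 = 2J / (1 + J) holds for any counts, so the fscore and jcc rows in this log carry the same information in two forms.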
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.18439102 0.16080046 0.20425344 0.16646791 0.21126723 0.14091587
|
|
0.19247937 0.15423417 0.18020916 0.15448427]
|
|
|
|
mean value: 0.1749502897262573
|
|
|
|
key: score_time
|
|
value: [0.02884722 0.02191401 0.04009485 0.03732443 0.02903533 0.03681111
|
|
0.02155042 0.04020762 0.04215455 0.03590298]
|
|
|
|
mean value: 0.03338425159454346
|
|
|
|
key: test_mcc
|
|
value: [0.95713391 1. 1. 1. 1. 1.
|
|
0.97100831 1. 0.95681396 1. ]
|
|
|
|
mean value: 0.9884956187492568
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97810219 1. 1. 1. 1. 1.
|
|
0.98529412 1. 0.97794118 1. ]
|
|
|
|
mean value: 0.9941337483898669
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97841727 1. 1. 1. 1. 1.
|
|
0.98550725 1. 0.97841727 1. ]
|
|
|
|
mean value: 0.9942341778750912
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95774648 1. 1. 1. 1. 1.
|
|
0.97142857 1. 0.95774648 1. ]
|
|
|
|
mean value: 0.988692152917505
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97826087 1. 1. 1. 1. 1.
|
|
0.98529412 1. 0.97794118 1. ]
|
|
|
|
mean value: 0.9941496163682865
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.95774648 1. 1. 1. 1. 1.
|
|
0.97142857 1. 0.95774648 1. ]
|
|
|
|
mean value: 0.988692152917505
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.86
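
A blind-test MCC of 0.0 next to an accuracy of 0.86 is the pattern produced when a classifier assigns every sample to a single class, which would also be consistent with the UndefinedMetricWarning about "no predicted samples" logged for QDA. The sketch below illustrates this with hypothetical labels (86 negatives and 14 positives, chosen only to reproduce an accuracy of 0.86; the real blind-test counts are not shown in this log). The hand-written MCC returns 0 when its denominator vanishes, the same convention scikit-learn's matthews_corrcoef uses.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical blind-test labels: an imbalanced set, all predicted negative.
y_true = [0] * 86 + [1] * 14
y_pred = [0] * len(y_true)

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

print("accuracy:", accuracy)
print("MCC:", mcc(tp, tn, fp, fn))
```

Accuracy rewards the majority class here, whereas MCC reports no correlation between predictions and labels, which is why the two blind-test figures diverge so sharply.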
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06424856 0.0752933 0.07174444 0.06764746 0.06863308 0.06759048
|
|
0.0916636 0.07553363 0.07163095 0.07230806]
|
|
|
|
mean value: 0.07262935638427734
|
|
|
|
key: score_time
|
|
value: [0.04801512 0.03882694 0.02715278 0.02783155 0.03385282 0.02948785
|
|
0.02757883 0.02810884 0.02770877 0.03306103]
|
|
|
|
mean value: 0.03216245174407959
|
|
|
|
key: test_mcc
|
|
value: [0.88654289 0.8251972 0.82480818 0.82788248 0.75008111 0.76603235
|
|
0.79549513 0.88273483 0.85442069 0.87000211]
|
|
|
|
mean value: 0.8283196987346125
|
|
|
|
key: train_mcc
|
|
value: [0.85838091 0.88143034 0.87309027 0.87137251 0.88488253 0.87307997
|
|
0.87638889 0.87175299 0.88330335 0.87485394]
|
|
|
|
mean value: 0.8748535686741495
|
|
|
|
key: test_accuracy
|
|
value: [0.94160584 0.91240876 0.91240876 0.91240876 0.875 0.88235294
|
|
0.89705882 0.94117647 0.92647059 0.93382353]
|
|
|
|
mean value: 0.9134714469729498
|
|
|
|
key: train_accuracy
|
|
value: [0.92909535 0.9405053 0.93643032 0.93561532 0.94218241 0.93648208
|
|
0.93811075 0.93566775 0.94136808 0.93729642]
|
|
|
|
mean value: 0.9372753783625218
|
|
|
|
key: test_fscore
|
|
value: [0.93846154 0.91304348 0.91304348 0.91666667 0.87591241 0.88571429
|
|
0.9 0.94202899 0.92857143 0.93617021]
|
|
|
|
mean value: 0.9149612482967987
|
|
|
|
key: train_fscore
|
|
value: [0.92989525 0.9414595 0.93709677 0.93613581 0.94315452 0.93699515
|
|
0.93870968 0.93664796 0.9424 0.9380531 ]
|
|
|
|
mean value: 0.9380547742168248
|
|
|
|
key: test_precision
|
|
value: [0.98387097 0.9 0.91304348 0.88 0.86956522 0.86111111
|
|
0.875 0.92857143 0.90277778 0.90410959]
|
|
|
|
mean value: 0.9018049569895523
|
|
|
|
key: train_precision
|
|
value: [0.92025518 0.92733017 0.92663477 0.92788462 0.92755906 0.92948718
|
|
0.92971246 0.92259084 0.92610063 0.92686804]
|
|
|
|
mean value: 0.9264422946711286
|
|
|
|
key: test_recall
|
|
value: [0.89705882 0.92647059 0.91304348 0.95652174 0.88235294 0.91176471
|
|
0.92647059 0.95588235 0.95588235 0.97058824]
|
|
|
|
mean value: 0.9296035805626599
|
|
|
|
key: train_recall
|
|
value: [0.93973941 0.95602606 0.94779772 0.94453507 0.95928339 0.94462541
|
|
0.94788274 0.95114007 0.95928339 0.9495114 ]
|
|
|
|
mean value: 0.9499824646237067
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:196: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_cd_sl.py:199: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_roc_auc
|
|
value: [0.94128303 0.91251066 0.91240409 0.9120844 0.875 0.88235294
|
|
0.89705882 0.94117647 0.92647059 0.93382353]
|
|
|
|
mean value: 0.913416453537937
|
|
|
|
key: train_roc_auc
|
|
value: [0.92908667 0.94049264 0.93643957 0.93562259 0.94218241 0.93648208
|
|
0.93811075 0.93566775 0.94136808 0.93729642]
|
|
|
|
mean value: 0.9372748962490236
|
|
|
|
key: test_jcc
|
|
value: [0.88405797 0.84 0.84 0.84615385 0.77922078 0.79487179
|
|
0.81818182 0.89041096 0.86666667 0.88 ]
|
|
|
|
mean value: 0.8439563835013507
|
|
|
|
key: train_jcc
|
|
value: [0.8689759 0.88939394 0.88163885 0.87993921 0.89242424 0.88145897
|
|
0.88449848 0.88084465 0.89107413 0.88333333]
|
|
|
|
mean value: 0.8833581697694837
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', RidgeClassifierCV(cv=10))])

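Note: the pipeline printed above scales the numerical columns with `MinMaxScaler`, one-hot encodes the categorical columns, and fits `RidgeClassifierCV(cv=10)`. The per-fold `key:` / `value:` / `mean value:` entries that follow have the shape of scikit-learn's `cross_validate` output with `return_train_score=True`. A minimal sketch of how such output could be produced is given below. The synthetic data, the reduced column set, the `cv_scores` name, and the exact scoring dictionary are assumptions for illustration and are not taken from the original script.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import RidgeClassifierCV
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data: two numerical and one categorical feature (assumed).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.normal(size=n),
    "ddg_foldx": rng.normal(size=n),
    "ss_class": rng.choice(["helix", "sheet", "loop"], size=n),
})
y = (X["ligand_distance"] + 0.5 * X["ddg_foldx"] > 0).astype(int)

num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class"]

# Same structure as the pipeline repr in the log.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
)
pipe = Pipeline(steps=[("prep", prep), ("model", RidgeClassifierCV(cv=10))])

# Scorer names chosen to reproduce the keys seen in the log (assumed mapping).
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

cv_scores = cross_validate(pipe, X, y, cv=10, scoring=scoring,
                           return_train_score=True)

# Print in the same key / value / mean value layout as the log.
for key, value in cv_scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

In this sketch `roc_auc` is scored from the decision function, which is scikit-learn's default for that scorer name; the values in the log may have been computed differently.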
key: fit_time
value: [0.65006852 0.65783858 0.64332271 0.60425997 0.60372329 0.5961287
 0.64778042 0.60070848 0.55460596 0.62646484]

mean value: 0.6184901475906373

key: score_time
value: [0.02721882 0.02727461 0.02708626 0.02812886 0.02766991 0.03826523
 0.02749848 0.05201983 0.02817082 0.02796197]

mean value: 0.03112947940826416

key: test_mcc
value: [0.88654289 0.84393916 0.83951407 0.82788248 0.72129053 0.76603235
 0.79549513 0.84051051 0.8131434 0.87000211]

mean value: 0.8204352632641159

key: train_mcc
value: [0.85838091 0.87678779 0.88469777 0.87137251 0.88015146 0.87307997
 0.87638889 0.87175299 0.87811224 0.87485394]

mean value: 0.8745578471519682

key: test_accuracy
value: [0.94160584 0.91970803 0.91970803 0.91240876 0.86029412 0.88235294
 0.89705882 0.91911765 0.90441176 0.93382353]

mean value: 0.9090489480463718

key: train_accuracy
value: [0.92909535 0.93806031 0.94213529 0.93561532 0.93973941 0.93648208
 0.93811075 0.93566775 0.93892508 0.93729642]

mean value: 0.9371127773839958

key: test_fscore
value: [0.93846154 0.92307692 0.91970803 0.91666667 0.86330935 0.88571429
 0.9 0.92198582 0.90909091 0.93617021]

mean value: 0.9114183733094183

key: train_fscore
value: [0.92989525 0.93929712 0.94297189 0.93613581 0.94089457 0.93699515
 0.93870968 0.93664796 0.93966211 0.9380531 ]

mean value: 0.9379262630193704

key: test_precision
value: [0.98387097 0.88 0.92647059 0.88 0.84507042 0.86111111
 0.875 0.89041096 0.86666667 0.90410959]

mean value: 0.8912710304235424

key: train_precision
value: [0.92025518 0.92163009 0.92879747 0.92788462 0.92319749 0.92948718
 0.92971246 0.92259084 0.92845787 0.92686804]

mean value: 0.9258881244342322

key: test_recall
value: [0.89705882 0.97058824 0.91304348 0.95652174 0.88235294 0.91176471
 0.92647059 0.95588235 0.95588235 0.97058824]

mean value: 0.9340153452685422

key: train_recall
value: [0.93973941 0.95765472 0.95758564 0.94453507 0.95928339 0.94462541
 0.94788274 0.95114007 0.95114007 0.9495114 ]

mean value: 0.9503097916478471

key: test_roc_auc
value: [0.94128303 0.92007673 0.91975703 0.9120844 0.86029412 0.88235294
 0.89705882 0.91911765 0.90441176 0.93382353]

mean value: 0.9090260017050299

key: train_roc_auc
value: [0.92908667 0.93804433 0.94214787 0.93562259 0.93973941 0.93648208
 0.93811075 0.93566775 0.93892508 0.93729642]

mean value: 0.9371122954870318

key: test_jcc
value: [0.88405797 0.85714286 0.85135135 0.84615385 0.75949367 0.79487179
 0.81818182 0.85526316 0.83333333 0.88 ]

mean value: 0.8379849800830307

key: train_jcc
value: [0.8689759 0.88554217 0.89209726 0.87993921 0.88838612 0.88145897
 0.88449848 0.88084465 0.8861912 0.88333333]

mean value: 0.8831267294611943

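Note: the per-fold metrics above are all functions of one confusion matrix per fold. As a worked check, the first test-fold values for Ridge ClassifierCV (precision 0.98387097, recall 0.89705882, accuracy 0.94160584) are consistent with the counts TP = 61, FP = 1, FN = 7, TN = 68. These counts are inferred from the printed ratios and do not appear in the log. The sketch below recomputes the remaining metrics from them.

```python
import math

# Confusion counts inferred from the first test fold (assumption, not logged).
tp, fp, fn, tn = 61, 1, 7, 68

precision = tp / (tp + fp)                  # 61/62
recall = tp / (tp + fn)                     # 61/68
accuracy = (tp + tn) / (tp + fp + fn + tn)  # 129/137
fscore = 2 * tp / (2 * tp + fp + fn)        # F1 score
jcc = tp / (tp + fp + fn)                   # Jaccard index

# Matthews correlation coefficient.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# Mean of recall and specificity (balanced accuracy).
specificity = tn / (tn + fp)
balanced = (recall + specificity) / 2

print(round(fscore, 8), round(jcc, 8), round(mcc, 8), round(balanced, 8))
```

Under these inferred counts the results agree with the first entries of test_fscore (0.93846154), test_jcc (0.88405797) and test_mcc (0.88654289). The balanced value also agrees with the first test_roc_auc entry (0.94128303), which suggests, but does not confirm, that roc_auc in this run was scored on hard class predictions rather than on continuous decision scores.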
MCC on Blind test: 0.61

Accuracy on Blind test: 0.89