19928 lines
991 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_cd_sl.py:548: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
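The SettingWithCopyWarning above points at an in-place sort on `mask_check`, which pandas suspects is a slice of another DataFrame. The following is a minimal sketch of one common way to avoid the warning, assuming `mask_check` was produced by boolean filtering (the log does not show how it was built). The toy frame and the threshold of 20 are hypothetical; only the column names come from the log.

```python
import pandas as pd

# Hypothetical stand-in for the script's DataFrame; only the column names
# 'mutationinformation' and 'ligand_distance' appear in the log.
df = pd.DataFrame({
    "mutationinformation": ["A1B", "C2D", "E3F", "G4H"],
    "ligand_distance": [12.5, 3.1, 8.7, 25.0],
})

# .copy() makes mask_check an independent DataFrame rather than a possible view.
# The threshold is an illustrative assumption.
mask_check = df[df["ligand_distance"] < 20].copy()

# Reassigning the sorted result avoids mutating a slice in place.
mask_check = mask_check.sort_values(by=["ligand_distance"], ascending=True)

print(mask_check)
```

Reassignment keeps the original `df` untouched, which also makes the step easier to test in isolation.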
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 817

PASS: my_features_df and aa_df successfully combined
nrows: 817
ncols: 269
count of NULL values before imputation

or_mychisq          244
log10_or_mychisq    244
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 168
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 175
-------------------------------------------------------------
Successfully split data with stratification according to scaling law [COMPLETE data]: 1/sqrt(x_ncols)
Input features data size: (817, 175)
Train data size: (755, 175)
Test data size: (62, 175)
y_train numbers: Counter({0: 437, 1: 318})
y_train ratio: 1.3742138364779874

y_test_numbers: Counter({0: 36, 1: 26})
y_test ratio: 1.3846153846153846
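The split sizes above follow from the stated scaling law, a test fraction of 1/sqrt(x_ncols). The following is a small sketch that reproduces the arithmetic from the logged numbers. Rounding the test count up is an assumption (it matches how scikit-learn's `train_test_split` treats a float `test_size`); the script's own splitting code is not shown in the log.

```python
import math
from collections import Counter

n_rows, n_cols = 817, 175  # "Input features data size" from the log

# Scaling-law test fraction quoted in the log: 1/sqrt(x_ncols)
test_fraction = 1 / math.sqrt(n_cols)

# Assumption: the test count is rounded up, as scikit-learn's
# train_test_split does for a float test_size.
n_test = math.ceil(n_rows * test_fraction)
n_train = n_rows - n_test

# Class counts copied from the log; the ratio is class 0 over class 1.
y_train = Counter({0: 437, 1: 318})
y_test = Counter({0: 36, 1: 26})
train_ratio = y_train[0] / y_train[1]
test_ratio = y_test[0] / y_test[1]

print(f"test fraction: {test_fraction:.4f}")
print(f"train/test rows: {n_train}/{n_test}")
print(f"class ratios (0:1): train {train_ratio:.4f}, test {test_ratio:.4f}")
```

The two ratios are close, which is what a stratified split is meant to achieve.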
-------------------------------------------------------------

index: 0
ind: 1

Mask count check: True

index: 1
ind: 2

Mask count check: True
Original Data
Counter({0: 437, 1: 318}) Data dim: (755, 175)

Simple Random OverSampling
Counter({0: 437, 1: 437})
(874, 175)

Simple Random UnderSampling
Counter({0: 318, 1: 318})
(636, 175)

Simple Combined Over and UnderSampling
Counter({0: 437, 1: 437})
(874, 175)

SMOTE_NC OverSampling
Counter({0: 437, 1: 437})
(874, 175)
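The class counts above can be reproduced with a plain-Python sketch of simple random over- and undersampling. This illustrates the idea only: the log does not show which resampling library or settings the run used, the helper names are hypothetical, and SMOTE-style synthetic sampling is not covered.

```python
import random
from collections import Counter

random.seed(42)
labels = [0] * 437 + [1] * 318  # class counts from "Original Data"

def random_oversample_indices(y):
    """Return row indices with minority rows re-drawn (with replacement)
    until every class matches the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    keep = list(range(len(y)))
    for cls, n in counts.items():
        pool = [i for i, v in enumerate(y) if v == cls]
        keep.extend(random.choices(pool, k=target - n))
    return keep

def random_undersample_indices(y):
    """Return row indices with every class cut down (without replacement)
    to the size of the smallest class."""
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for cls in counts:
        pool = [i for i, v in enumerate(y) if v == cls]
        keep.extend(random.sample(pool, target))
    return keep

over = [labels[i] for i in random_oversample_indices(labels)]
under = [labels[i] for i in random_undersample_indices(labels)]
print(Counter(over), len(over))
print(Counter(under), len(under))
```

Returning indices rather than labels lets the same selection be applied to the feature rows.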
#####################################################################

Running ML analysis [COMPLETE DATA]: 70/30 split
Gene name: katG
Drug name: isoniazid

Output directory: /home/tanu/git/Data/isoniazid/output/ml/tts_cd_sl/

Sanity checks:
Total input features: 175

Training data size: (755, 175)
Test data size: (62, 175)

Target feature numbers (training data): Counter({0: 437, 1: 318})
Target features ratio (training data): 1.3742138364779874

Target feature numbers (test data): Counter({0: 36, 1: 26})
Target features ratio (test data): 1.3846153846153846

#####################################################################


================================================================

Structural features (n): 36
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
These are:
['ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106']
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match
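The "No. of features match" check can be verified by hand from the group sizes logged above. The following is a short sketch; the dictionary name and keys are illustrative, and the script's actual check is not shown in the log.

```python
# Group sizes as reported in the log ("... features (n): N").
feature_groups = {
    "structural": 36,
    "aaindex": 123,
    "evolutionary": 3,
    "genomic": 6,
    "categorical": 7,
}

expected_total = 175  # "No. of columns for x_features" from the log

total = sum(feature_groups.values())
numerical = total - feature_groups["categorical"]

print(f"total features: {total} (expected {expected_total})")
print(f"numerical features: {numerical}")
```

The numerical count of 168 agrees with "No. of numerical features: 168" earlier in the log.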
#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
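The ConvergenceWarning earlier in the log suggests either scaling the data or raising `max_iter`. The pipeline printed above already applies `MinMaxScaler` to the numeric columns, so raising `max_iter` is the suggestion still open. The following is a minimal sketch of the same pipeline shape with a higher iteration cap. The toy data, the `ss_class` category labels and the value `max_iter=5000` are illustrative assumptions, not taken from the run.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; only the column names come from the log.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, n),
    "ddg_foldx": rng.normal(0, 2, n),
    "ss_class": rng.choice(["helix", "sheet", "loop"], n),
})
y = (X["ligand_distance"] + rng.normal(0, 5, n) < 15).astype(int)

# Same layout as the logged pipeline: scale numeric, one-hot categorical.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), ["ligand_distance", "ddg_foldx"]),
        ("cat", OneHotEncoder(), ["ss_class"]),
    ],
)
pipe = Pipeline(steps=[
    ("prep", prep),
    # max_iter raised from the default of 100; 5000 is an arbitrary choice.
    ("model", LogisticRegression(max_iter=5000, random_state=42)),
])

# Record warnings during fitting so convergence can be checked explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pipe.fit(X, y)

convergence_warnings = [
    w for w in caught if issubclass(w.category, ConvergenceWarning)
]
n_iter = int(pipe.named_steps["model"].n_iter_[0])
print(f"lbfgs iterations used: {n_iter}, "
      f"convergence warnings: {len(convergence_warnings)}")
```

If the solver still hits the cap on the real data, the scikit-learn documentation linked in the warning lists alternative solvers.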
key: fit_time
value: [0.03845477 0.04111099 0.04018474 0.04097939 0.03893089 0.03982425
 0.04032898 0.0402391 0.03961587 0.0398469 ]

mean value: 0.03995158672332764

key: score_time
value: [0.01270437 0.01245141 0.01341653 0.01328707 0.01330495 0.01333356
 0.01330638 0.01323581 0.01337004 0.01323414]

mean value: 0.013164424896240234

key: test_mcc
value: [0.41185791 0.51905381 0.41185791 0.37674326 0.70337995 0.51056179
 0.48661327 0.54396846 0.69927678 0.61631563]

mean value: 0.5279628780527089

key: train_mcc
value: [0.64996517 0.64392978 0.665228 0.65217074 0.63909822 0.64501189
 0.64719164 0.67987312 0.62943216 0.65474146]

mean value: 0.650664217640774

key: test_accuracy
value: [0.71052632 0.76315789 0.71052632 0.69736842 0.85526316 0.76
 0.74666667 0.77333333 0.85333333 0.81333333]

mean value: 0.7683508771929825

key: train_accuracy
value: [0.82916053 0.82621502 0.8365243 0.83063328 0.82326951 0.82647059
 0.82647059 0.84411765 0.81911765 0.83088235]

mean value: 0.8292861474486701

key: test_fscore
value: [0.66666667 0.72727273 0.66666667 0.63492063 0.81355932 0.71875
 0.70769231 0.74626866 0.82539683 0.77419355]

mean value: 0.7281387355753242

key: train_fscore
value: [0.79790941 0.79442509 0.80695652 0.79789104 0.79310345 0.79584775
 0.79931973 0.81403509 0.78608696 0.8020654 ]

mean value: 0.7987640429167654

key: test_precision
value: [0.64705882 0.70588235 0.64705882 0.64516129 0.88888889 0.6969697
 0.67647059 0.71428571 0.83870968 0.8 ]

mean value: 0.726048585612153

key: train_precision
value: [0.79513889 0.79166667 0.80276817 0.80212014 0.78231293 0.79037801
 0.7807309 0.81690141 0.78200692 0.78983051]

mean value: 0.793385452938167

key: test_recall
value: [0.6875 0.75 0.6875 0.625 0.75 0.74193548
 0.74193548 0.78125 0.8125 0.75 ]

mean value: 0.7327620967741936

key: train_recall
value: [0.8006993 0.7972028 0.81118881 0.79370629 0.8041958 0.80139373
 0.81881533 0.81118881 0.79020979 0.81468531]

mean value: 0.8043285982310373

key: test_roc_auc
value: [0.70738636 0.76136364 0.70738636 0.6875 0.84090909 0.75733138
 0.74596774 0.77434593 0.84811047 0.80523256]

mean value: 0.7635533528268431

key: train_roc_auc
value: [0.82528604 0.82226552 0.83307532 0.82560633 0.82067297 0.82308872
 0.8254382 0.83960456 0.81515566 0.82866245]

mean value: 0.8258855762883787

key: test_jcc
value: [0.5 0.57142857 0.5 0.46511628 0.68571429 0.56097561
 0.54761905 0.5952381 0.7027027 0.63157895]

mean value: 0.5760373538896989

key: train_jcc
value: [0.66376812 0.65895954 0.67638484 0.66374269 0.65714286 0.66091954
 0.66572238 0.68639053 0.64756447 0.66954023]

mean value: 0.6650135192542527
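The `test_jcc` and `test_fscore` values above are consistent with the standard binary-classification identity J = F / (2 - F) between the Jaccard index and the F1 score. The following is a small sketch that checks this fold by fold on the logged numbers. Reading `jcc` as the Jaccard index and `fscore` as F1 is an assumption based on the key names.

```python
# Per-fold values copied from the log (test_fscore and test_jcc).
test_fscore = [0.66666667, 0.72727273, 0.66666667, 0.63492063, 0.81355932,
               0.71875, 0.70769231, 0.74626866, 0.82539683, 0.77419355]
test_jcc = [0.5, 0.57142857, 0.5, 0.46511628, 0.68571429,
            0.56097561, 0.54761905, 0.5952381, 0.7027027, 0.63157895]

def jaccard_from_f1(f1):
    """Jaccard = TP/(TP+FP+FN) and F1 = 2TP/(2TP+FP+FN) give J = F/(2-F)."""
    return f1 / (2.0 - f1)

derived = [jaccard_from_f1(f) for f in test_fscore]
max_abs_diff = max(abs(d - j) for d, j in zip(derived, test_jcc))
print(f"largest |derived - logged| across folds: {max_abs_diff:.2e}")
```

Because the relation is nonlinear, it holds per fold but not for the fold means.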
MCC on Blind test: 0.36

Accuracy on Blind test: 0.69
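The logged "mean value" lines can be recomputed from the per-fold arrays and set against the blind-test score. The sketch below does this for MCC. It assumes the `train_` and `test_` keys are the training and held-out scores of each cross-validation fold, which the key names suggest but the log does not state.

```python
from statistics import fmean, stdev

# Per-fold MCC values copied from the log.
test_mcc = [0.41185791, 0.51905381, 0.41185791, 0.37674326, 0.70337995,
            0.51056179, 0.48661327, 0.54396846, 0.69927678, 0.61631563]
train_mcc = [0.64996517, 0.64392978, 0.665228, 0.65217074, 0.63909822,
             0.64501189, 0.64719164, 0.67987312, 0.62943216, 0.65474146]
blind_mcc = 0.36  # "MCC on Blind test" from the log

mean_test = fmean(test_mcc)
mean_train = fmean(train_mcc)

# Gaps between training, held-out fold and blind-test performance.
train_cv_gap = mean_train - mean_test
cv_blind_gap = mean_test - blind_mcc

print(f"mean train MCC: {mean_train:.3f}")
print(f"mean held-out MCC: {mean_test:.3f} (sd {stdev(test_mcc):.3f})")
print(f"train minus held-out: {train_cv_gap:.3f}")
print(f"held-out minus blind: {cv_blind_gap:.3f}")
```

Both gaps are positive, so the model scores lower on data it has not seen, and lowest on the blind test set.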
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.8990736 1.00928378 0.92249584 1.06357527 0.90100861 1.05567336
0.92924047 0.98903608 0.91987348 0.93955755]

mean value: 0.9628818035125732

key: score_time
value: [0.01704144 0.0151968 0.01524448 0.01529384 0.01529455 0.01550865
0.01517725 0.01560426 0.01515341 0.01526475]

mean value: 0.015477943420410156

key: test_mcc
value: [0.49527383 0.45626404 0.4822 0.48956862 0.6198304 0.60266409
0.75402183 0.52770861 0.64361974 0.45993751]

mean value: 0.5531088664051463

key: train_mcc
value: [0.78145361 0.76267653 0.76943669 0.76836323 0.77438554 0.77167332
0.78028857 0.71665959 0.74490432 0.78769765]

mean value: 0.7657539050746246

key: test_accuracy
value: [0.75 0.73684211 0.75 0.75 0.81578947 0.8
0.88 0.76 0.82666667 0.73333333]

mean value: 0.7802631578947369

key: train_accuracy
value: [0.89248895 0.88365243 0.88659794 0.88659794 0.88954345 0.88823529
0.89264706 0.86176471 0.875 0.89558824]

mean value: 0.8852116001039592

key: test_fscore
value: [0.71641791 0.67741935 0.68852459 0.70769231 0.77419355 0.7761194
0.85714286 0.74285714 0.78688525 0.6969697 ]

mean value: 0.742422205738622

key: train_fscore
value: [0.87521368 0.86402754 0.86837607 0.86701209 0.87046632 0.86896552
0.87348354 0.83623693 0.85370052 0.87863248]

mean value: 0.86561146749211

key: test_precision
value: [0.68571429 0.7 0.72413793 0.6969697 0.8 0.72222222
0.84375 0.68421053 0.82758621 0.67647059]

mean value: 0.7361061457388323

key: train_precision
value: [0.85618729 0.85084746 0.84949833 0.85665529 0.86006826 0.86006826
0.86896552 0.83333333 0.84067797 0.85953177]

mean value: 0.8535833474481594

key: test_recall
value: [0.75 0.65625 0.65625 0.71875 0.75 0.83870968
0.87096774 0.8125 0.75 0.71875 ]

mean value: 0.7522177419354839

key: train_recall
value: [0.8951049 0.87762238 0.88811189 0.87762238 0.88111888 0.87804878
0.87804878 0.83916084 0.86713287 0.8986014 ]

mean value: 0.8780573085451134

key: test_roc_auc
value: [0.75 0.72585227 0.73721591 0.74573864 0.80681818 0.80571848
0.87866569 0.76671512 0.81686047 0.73146802]

mean value: 0.7765052768874037

key: train_roc_auc
value: [0.89284507 0.88283155 0.88680404 0.88537607 0.88839659 0.88686154
0.89067833 0.85866671 0.87392176 0.89600121]

mean value: 0.8842382873178545

key: test_jcc
value: [0.55813953 0.51219512 0.525 0.54761905 0.63157895 0.63414634
0.75 0.59090909 0.64864865 0.53488372]

mean value: 0.5933120453773796

key: train_jcc
value: [0.7781155 0.76060606 0.7673716 0.7652439 0.7706422 0.76829268
0.77538462 0.71856287 0.74474474 0.78353659]

mean value: 0.7632500770281704

MCC on Blind test: 0.43

Accuracy on Blind test: 0.73

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0166285 0.01156783 0.01255155 0.01241207 0.01223373 0.01212716
|
|
0.01170945 0.01129079 0.01182938 0.01191139]
|
|
|
|
mean value: 0.012426185607910156
|
|
|
|
key: score_time
|
|
value: [0.01285315 0.01049924 0.01024437 0.0101192 0.00957632 0.00992823
|
|
0.00999331 0.01004028 0.00996041 0.01002264]
|
|
|
|
mean value: 0.010323715209960938
|
|
|
|
key: test_mcc
|
|
value: [0.31980107 0.27449801 0.31694616 0.40625 0.56818182 0.54100373
|
|
0.34692019 0.37816125 0.45123109 0.39714245]
|
|
|
|
mean value: 0.4000135759249381
|
|
|
|
key: train_mcc
|
|
value: [0.44674155 0.46040905 0.44846705 0.45967633 0.44505203 0.43840782
|
|
0.43709404 0.41215522 0.43630676 0.43976523]
|
|
|
|
mean value: 0.44240750857578764
|
|
|
|
key: test_accuracy
|
|
value: [0.65789474 0.64473684 0.64473684 0.71052632 0.78947368 0.77333333
|
|
0.68 0.64 0.73333333 0.69333333]
|
|
|
|
mean value: 0.6967368421052632
|
|
|
|
key: train_accuracy
|
|
value: [0.72312224 0.73048601 0.72459499 0.7275405 0.72164948 0.71764706
|
|
0.71470588 0.68382353 0.72794118 0.71911765]
|
|
|
|
mean value: 0.7190628519449016
|
|
|
|
key: test_fscore
|
|
value: [0.62857143 0.58461538 0.64 0.65625 0.75 0.73846154
|
|
0.625 0.68235294 0.67741935 0.67605634]
|
|
|
|
mean value: 0.6658726985691701
|
|
|
|
key: train_fscore
|
|
value: [0.69381107 0.700491 0.69394435 0.70304976 0.69367909 0.69131833
|
|
0.69303797 0.68975469 0.66179159 0.6904376 ]
|
|
|
|
mean value: 0.6911315462615466
|
|
|
|
key: test_precision
|
|
value: [0.57894737 0.57575758 0.55813953 0.65625 0.75 0.70588235
|
|
0.60606061 0.54716981 0.7 0.61538462]
|
|
|
|
mean value: 0.6293591864769502
|
|
|
|
key: train_precision
|
|
value: [0.64939024 0.65846154 0.65230769 0.64985163 0.64652568 0.64179104
|
|
0.63478261 0.58722359 0.69348659 0.64350453]
|
|
|
|
mean value: 0.6457325148933183
|
|
|
|
key: test_recall
|
|
value: [0.6875 0.59375 0.75 0.65625 0.75 0.77419355
|
|
0.64516129 0.90625 0.65625 0.75 ]
|
|
|
|
mean value: 0.7169354838709677
|
|
|
|
key: train_recall
|
|
value: [0.74475524 0.74825175 0.74125874 0.76573427 0.74825175 0.74912892
|
|
0.7630662 0.83566434 0.63286713 0.74475524]
|
|
|
|
mean value: 0.7473733583489681
|
|
|
|
key: test_roc_auc
|
|
value: [0.66193182 0.63778409 0.65909091 0.703125 0.78409091 0.77346041
|
|
0.67485337 0.67405523 0.72347384 0.7005814 ]
|
|
|
|
mean value: 0.6992446975380209
|
|
|
|
key: train_roc_auc
|
|
value: [0.72606719 0.7329045 0.72686347 0.73273991 0.72527091 0.7218927
|
|
0.72122776 0.7046342 0.71491072 0.72263143]
|
|
|
|
mean value: 0.7229142789213228
|
|
|
|
key: test_jcc
|
|
value: [0.45833333 0.41304348 0.47058824 0.48837209 0.6 0.58536585
|
|
0.45454545 0.51785714 0.51219512 0.5106383 ]
|
|
|
|
mean value: 0.501093901079627
|
|
|
|
key: train_jcc
|
|
value: [0.53117207 0.53904282 0.53132832 0.54207921 0.53101737 0.52825553
|
|
0.53026634 0.52643172 0.49453552 0.52722772]
|
|
|
|
mean value: 0.52813566214748
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.0125792 0.01233125 0.01156139 0.01170778 0.01114821 0.01106834
0.01238728 0.01232433 0.01200199 0.0123198 ]

mean value: 0.01194295883178711

key: score_time
value: [0.01004314 0.00967956 0.00956941 0.0099113 0.00907087 0.00946569
0.00997758 0.00993609 0.00979972 0.01003027]

mean value: 0.009748363494873047

key: test_mcc
value: [0.59365605 0.28140559 0.17447146 0.48519965 0.48956862 0.46313625
0.35577952 0.38895144 0.5603175 0.25810271]

mean value: 0.40505887826745546

key: train_mcc
value: [0.432963 0.46234808 0.46694919 0.46570418 0.44960003 0.44154422
0.43213035 0.44706869 0.45428806 0.45555535]

mean value: 0.450815115609883

key: test_accuracy
value: [0.80263158 0.64473684 0.59210526 0.75 0.75 0.73333333
0.69333333 0.69333333 0.78666667 0.62666667]

mean value: 0.707280701754386

key: train_accuracy
value: [0.72164948 0.73784978 0.73784978 0.73784978 0.72901325 0.72647059
0.71911765 0.72941176 0.73235294 0.73235294]

mean value: 0.7303917958936151

key: test_fscore
value: [0.76190476 0.59701493 0.53731343 0.6984127 0.70769231 0.6969697
0.59649123 0.66666667 0.73333333 0.6 ]

mean value: 0.6595799051258595

key: train_fscore
value: [0.67692308 0.68881119 0.69727891 0.69520548 0.68813559 0.68041237
0.68113523 0.68275862 0.68835616 0.69047619]

mean value: 0.686949282203034

key: test_precision
value: [0.77419355 0.57142857 0.51428571 0.70967742 0.6969697 0.65714286
0.65384615 0.62162162 0.78571429 0.55263158]

mean value: 0.6537511447698205

key: train_precision
value: [0.66220736 0.68881119 0.67880795 0.68120805 0.66776316 0.67118644
0.65384615 0.67346939 0.67449664 0.67218543]

mean value: 0.67239817623147

key: test_recall
value: [0.75 0.625 0.5625 0.6875 0.71875 0.74193548
0.5483871 0.71875 0.6875 0.65625 ]

mean value: 0.6696572580645161

key: train_recall
value: [0.69230769 0.68881119 0.71678322 0.70979021 0.70979021 0.68989547
0.71080139 0.69230769 0.7027972 0.70979021]

mean value: 0.702307448648912

key: test_roc_auc
value: [0.79545455 0.64204545 0.58806818 0.74147727 0.74573864 0.73460411
0.67192082 0.6965843 0.77398256 0.63045058]

mean value: 0.7020326459455773

key: train_roc_auc
value: [0.71765512 0.73117404 0.73498194 0.73402996 0.72639638 0.72153807
0.71799612 0.72432643 0.72830215 0.72926059]

mean value: 0.7265660801452282

key: test_jcc
value: [0.61538462 0.42553191 0.36734694 0.53658537 0.54761905 0.53488372
0.425 0.5 0.57894737 0.42857143]

mean value: 0.4959870400449163

key: train_jcc
value: [0.51162791 0.52533333 0.53524804 0.5328084 0.5245478 0.515625
0.5164557 0.51832461 0.52480418 0.52727273]

mean value: 0.523204769300403

MCC on Blind test: 0.36

Accuracy on Blind test: 0.69

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.01447773 0.01143241 0.01164007 0.01019645 0.01123261 0.01122975
0.01132512 0.01041198 0.01103187 0.01116967]

mean value: 0.011414766311645508

key: score_time
value: [0.08108211 0.01340771 0.01348948 0.013273 0.01390386 0.01398015
0.01823759 0.01867867 0.01493979 0.01354218]

mean value: 0.021453452110290528

key: test_mcc
value: [0.46022727 0.39836355 0.22765527 0.39617931 0.45626404 0.06268839
0.30951038 0.2369186 0.42015928 0.28614654]

mean value: 0.32541126471942616

key: train_mcc
value: [0.52476721 0.56995237 0.5594584 0.54382548 0.53998842 0.56545966
0.5811401 0.54484286 0.55700453 0.55603493]

mean value: 0.554247395764335

key: test_accuracy
value: [0.73684211 0.71052632 0.63157895 0.71052632 0.73684211 0.56
0.66666667 0.62666667 0.72 0.65333333]

mean value: 0.6752982456140351

key: train_accuracy
value: [0.77025037 0.79086892 0.78645066 0.77908689 0.77761414 0.78970588
0.79705882 0.77941176 0.78529412 0.78529412]

mean value: 0.7841035692627567

key: test_fscore
value: [0.6875 0.63333333 0.51724138 0.62068966 0.67741935 0.4
0.59016393 0.5625 0.61818182 0.58064516]

mean value: 0.5887674636553172

key: train_fscore
value: [0.71532847 0.74911661 0.73967684 0.72924188 0.72394881 0.73952641
0.75090253 0.7311828 0.73835125 0.73454545]

mean value: 0.7351821047557114

key: test_precision
value: [0.6875 0.67857143 0.57692308 0.69230769 0.7 0.45833333
0.6 0.5625 0.73913043 0.6 ]

mean value: 0.629526596591814

key: train_precision
value: [0.7480916 0.75714286 0.7601476 0.75373134 0.75862069 0.77480916
0.77902622 0.75 0.75735294 0.76515152]

mean value: 0.7604073928472855

key: test_recall
value: [0.6875 0.59375 0.46875 0.5625 0.65625 0.35483871
0.58064516 0.5625 0.53125 0.5625 ]

mean value: 0.5560483870967742

key: train_recall
value: [0.68531469 0.74125874 0.72027972 0.70629371 0.69230769 0.70731707
0.72473868 0.71328671 0.72027972 0.70629371]

mean value: 0.7117370434443605

key: test_roc_auc
value: [0.73011364 0.69460227 0.609375 0.69034091 0.72585227 0.52969208
0.65395894 0.6184593 0.69585756 0.64171512]

mean value: 0.6589967094046238

key: train_roc_auc
value: [0.75868788 0.78411538 0.77744266 0.76917739 0.76600117 0.77859492
0.78730572 0.77034894 0.77638351 0.77446665]

mean value: 0.7742524227309505

key: test_jcc
value: [0.52380952 0.46341463 0.34883721 0.45 0.51219512 0.25
0.41860465 0.39130435 0.44736842 0.40909091]

mean value: 0.4214624818341829

key: train_jcc
value: [0.55681818 0.59887006 0.58689459 0.57386364 0.56733524 0.5867052
0.60115607 0.57627119 0.58522727 0.58045977]

mean value: 0.5813601206085782

MCC on Blind test: 0.25

Accuracy on Blind test: 0.65

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.03710723 0.03756094 0.04211354 0.03792882 0.03758621 0.03751993
0.03698134 0.03782248 0.03819609 0.03786802]

mean value: 0.03806846141815186

key: score_time
value: [0.01552296 0.0156126 0.0167017 0.01760411 0.01570702 0.01563406
0.01553059 0.01514482 0.01581478 0.01570559]

mean value: 0.01589782238006592

key: test_mcc
value: [0.43580096 0.35227273 0.42733892 0.39836355 0.64678324 0.53504163
0.39516129 0.51409028 0.56404163 0.47754813]

mean value: 0.4746442352543855

key: train_mcc
value: [0.59772146 0.6009229 0.61621456 0.59600332 0.57394829 0.5881027
0.60569092 0.59718466 0.58048551 0.59083661]

mean value: 0.5947110918550393

key: test_accuracy
value: [0.72368421 0.68421053 0.72368421 0.71052632 0.82894737 0.77333333
0.70666667 0.76 0.78666667 0.74666667]

mean value: 0.7444385964912281

key: train_accuracy
value: [0.80412371 0.80559647 0.81296024 0.80412371 0.79381443 0.8
0.80735294 0.80441176 0.79705882 0.80147059]

mean value: 0.803091267434809

key: test_fscore
value: [0.67692308 0.625 0.6557377 0.63333333 0.78688525 0.73015873
0.64516129 0.72727273 0.71428571 0.68852459]

mean value: 0.688328241327977

key: train_fscore
value: [0.76625659 0.76842105 0.77758319 0.76122083 0.74545455 0.75800712
0.7729636 0.76376554 0.74909091 0.75935829]

mean value: 0.7622121663731163

key: test_precision
value: [0.66666667 0.625 0.68965517 0.67857143 0.82758621 0.71875
0.64516129 0.70588235 0.83333333 0.72413793]

mean value: 0.7114744382180014

key: train_precision
value: [0.77031802 0.77112676 0.77894737 0.78228782 0.77651515 0.77454545
0.76896552 0.77617329 0.78030303 0.77454545]

mean value: 0.7753727866413102

key: test_recall
value: [0.6875 0.625 0.625 0.59375 0.75 0.74193548
0.64516129 0.75 0.625 0.65625 ]

mean value: 0.6699596774193548

key: train_recall
value: [0.76223776 0.76573427 0.77622378 0.74125874 0.71678322 0.74216028
0.77700348 0.75174825 0.72027972 0.74475524]

mean value: 0.7498184742087182

key: test_roc_auc
value: [0.71875 0.67613636 0.71022727 0.69460227 0.81818182 0.76869501
0.69758065 0.75872093 0.76598837 0.73510174]

mean value: 0.7343984433608403

key: train_roc_auc
value: [0.79842168 0.80016993 0.80795922 0.79556576 0.783328 0.79219973
0.80326001 0.79719392 0.7865358 0.79369742]

mean value: 0.7958331466379481

key: test_jcc
value: [0.51162791 0.45454545 0.48780488 0.46341463 0.64864865 0.575
0.47619048 0.57142857 0.55555556 0.525 ]

mean value: 0.5269216125540572

key: train_jcc
value: [0.62108262 0.62393162 0.63610315 0.61449275 0.5942029 0.61031519
0.6299435 0.61781609 0.59883721 0.61206897]

mean value: 0.6158794004895489

MCC on Blind test: 0.32

Accuracy on Blind test: 0.68

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [1.70295882 2.80787826 1.40164876 2.38740396 2.44509292 2.75604963
2.07706332 2.70040226 1.80623651 2.78324103]

mean value: 2.286797547340393

key: score_time
value: [0.01258898 0.01494551 0.01265264 0.01272011 0.0129509 0.01516461
0.01273394 0.01545763 0.01267385 0.01545882]

mean value: 0.013734698295593262

key: test_mcc
value: [0.60895504 0.37247785 0.48064296 0.34721981 0.48240733 0.48661327
0.55985938 0.47378743 0.5603175 0.45993751]

mean value: 0.48322180837308737

key: train_mcc
value: [0.85287344 0.93046902 0.79700491 0.89204246 0.93708923 0.95183175
0.85638717 0.96111937 0.86728304 0.92523004]

mean value: 0.897133043483631

key: test_accuracy
value: [0.80263158 0.69736842 0.75 0.68421053 0.75 0.74666667
0.76 0.73333333 0.78666667 0.73333333]

mean value: 0.744421052631579

key: train_accuracy
value: [0.9263623 0.96612666 0.90132548 0.9455081 0.96907216 0.97647059
0.92794118 0.98088235 0.93382353 0.96323529]

mean value: 0.9490747639261891

key: test_fscore
value: [0.7826087 0.62295082 0.6779661 0.61290323 0.65454545 0.70769231
0.75675676 0.71428571 0.73333333 0.6969697 ]

mean value: 0.6960012106408936

key: train_fscore
value: [0.91610738 0.95943563 0.88057041 0.93802345 0.96229803 0.97222222
0.91819699 0.9775475 0.92411467 0.95697074]

mean value: 0.9405487018518649

key: test_precision
value: [0.72972973 0.65517241 0.74074074 0.63333333 0.7826087 0.67647059
0.65116279 0.65789474 0.78571429 0.67647059]

mean value: 0.6989297902973735

key: train_precision
value: [0.88064516 0.96797153 0.89818182 0.90032154 0.98892989 0.96885813
0.88141026 0.96587031 0.89250814 0.94237288]

mean value: 0.9287069662172294

key: test_recall
value: [0.84375 0.59375 0.625 0.59375 0.5625 0.74193548
0.90322581 0.78125 0.6875 0.71875 ]

mean value: 0.705141129032258

key: train_recall
value: [0.95454545 0.95104895 0.86363636 0.97902098 0.93706294 0.97560976
0.95818815 0.98951049 0.95804196 0.97202797]

mean value: 0.953869301430277

key: test_roc_auc
value: [0.80823864 0.68323864 0.73295455 0.671875 0.72443182 0.74596774
0.78115836 0.73946221 0.77398256 0.73146802]

mean value: 0.7392777526768055

key: train_roc_auc
value: [0.93019894 0.96407409 0.89619477 0.95007029 0.96471467 0.9763545
0.93202029 0.98206489 0.93714281 0.96444038]

mean value: 0.9497275621991028

key: test_jcc
value: [0.64285714 0.45238095 0.51282051 0.44186047 0.48648649 0.54761905
0.60869565 0.55555556 0.57894737 0.53488372]

mean value: 0.5362106904361175

key: train_jcc
value: [0.84520124 0.9220339 0.7866242 0.88328076 0.92733564 0.94594595
0.84876543 0.95608108 0.85893417 0.91749175]

mean value: 0.889169411533274

MCC on Blind test: 0.36

Accuracy on Blind test: 0.69

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05610561 0.04624677 0.04446769 0.04397821 0.04307437 0.04261947
|
|
0.04143143 0.04674125 0.04775262 0.04000306]
|
|
|
|
mean value: 0.04524204730987549
|
|
|
|
key: score_time
|
|
value: [0.00974059 0.00936556 0.00917125 0.00931931 0.00939131 0.00995302
|
|
0.00941157 0.00973129 0.01017833 0.00991678]
|
|
|
|
mean value: 0.009617900848388672
|
|
|
|
key: test_mcc
|
|
value: [0.59192216 0.57868822 0.67069242 0.64678324 0.59192216 0.64978463
|
|
0.67008798 0.62239581 0.726372 0.78140018]
|
|
|
|
mean value: 0.6530048787782348
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.80263158 0.78947368 0.82894737 0.82894737 0.80263158 0.82666667
|
|
0.84 0.81333333 0.86666667 0.89333333]
|
|
|
|
mean value: 0.8292631578947368
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75409836 0.76470588 0.81690141 0.78688525 0.75409836 0.8
|
|
0.80645161 0.78787879 0.83870968 0.87096774]
|
|
|
|
mean value: 0.7980697078153612
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.79310345 0.72222222 0.74358974 0.82758621 0.79310345 0.76470588
|
|
0.80645161 0.76470588 0.86666667 0.9 ]
|
|
|
|
mean value: 0.7982135113536016
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71875 0.8125 0.90625 0.75 0.71875 0.83870968
|
|
0.80645161 0.8125 0.8125 0.84375 ]
|
|
|
|
mean value: 0.802016129032258
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.79119318 0.79261364 0.83948864 0.81818182 0.79119318 0.82844575
|
|
0.83504399 0.81322674 0.85973837 0.88699128]
|
|
|
|
mean value: 0.8256116585964672
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.60526316 0.61904762 0.69047619 0.64864865 0.60526316 0.66666667
|
|
0.67567568 0.65 0.72222222 0.77142857]
|
|
|
|
mean value: 0.6654691909955068
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.82
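The per-fold scores logged for the Decision Tree are all functions of one confusion matrix per fold. As a worked check, the counts below were reconstructed from the first-fold values of test_accuracy, test_precision and test_recall. The log does not print a confusion matrix, so these counts are an inference, not logged data.

```python
from math import sqrt

# Reconstructed (not logged) first-fold counts for the Decision Tree:
# accuracy 0.80263158 = 61/76, precision 0.79310345 = 23/29, recall 0.71875 = 23/32.
tp, fn, fp, tn = 23, 9, 6, 38

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
fscore = 2 * precision * recall / (precision + recall)
jcc = tp / (tp + fp + fn)  # Jaccard index of the positive class
mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
roc_auc = (recall + specificity) / 2  # AUC of a hard 0/1 prediction

for name, val in [("accuracy", accuracy), ("precision", precision),
                  ("recall", recall), ("fscore", fscore),
                  ("jcc", jcc), ("mcc", mcc), ("roc_auc", roc_auc)]:
    print(f"{name}: {val:.8f}")
```

The printed values agree with the first entries of the test_accuracy, test_precision, test_recall, test_fscore, test_jcc, test_mcc and test_roc_auc arrays logged for the Decision Tree.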
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17381358 0.16700077 0.16607237 0.17023063 0.17047739 0.17239523
|
|
0.16992283 0.16555643 0.16628838 0.17819834]
|
|
|
|
mean value: 0.1699955940246582
|
|
|
|
key: score_time
|
|
value: [0.02001786 0.01942492 0.02027106 0.02043462 0.02041912 0.0209651
|
|
0.0195148 0.01986623 0.02057624 0.02045608]
|
|
|
|
mean value: 0.020194602012634278
|
|
|
|
key: test_mcc
|
|
value: [0.56530828 0.29269769 0.4822 0.36720508 0.51078616 0.36478009
|
|
0.53058923 0.41225113 0.6184593 0.45494186]
|
|
|
|
mean value: 0.45992188220568675
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.65789474 0.75 0.69736842 0.76315789 0.69333333
|
|
0.77333333 0.70666667 0.81333333 0.73333333]
|
|
|
|
mean value: 0.7377894736842106
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.74193548 0.58064516 0.68852459 0.59649123 0.70967742 0.62295082
|
|
0.72131148 0.67647059 0.78125 0.6875 ]
|
|
|
|
mean value: 0.68067567660675
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76666667 0.6 0.72413793 0.68 0.73333333 0.63333333
|
|
0.73333333 0.63888889 0.78125 0.6875 ]
|
|
|
|
mean value: 0.6978443486590038
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71875 0.5625 0.65625 0.53125 0.6875 0.61290323
|
|
0.70967742 0.71875 0.78125 0.6875 ]
|
|
|
|
mean value: 0.666633064516129
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.77982955 0.64488636 0.73721591 0.67471591 0.75284091 0.68145161
|
|
0.76392962 0.70821221 0.80922965 0.72747093]
|
|
|
|
mean value: 0.7279782658732865
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.58974359 0.40909091 0.525 0.425 0.55 0.45238095
|
|
0.56410256 0.51111111 0.64102564 0.52380952]
|
|
|
|
mean value: 0.5191264291264291
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.46
|
|
|
|
Accuracy on Blind test: 0.74
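Each metric in this log is reported as a key / value / mean value triplet, with the ten fold values wrapped over two lines. A small sketch of how those triplets could be read back and the mean recomputed is given here. The function name parse_cv_blocks is hypothetical, and the excerpt is the Extra Trees test_mcc block copied from the log, including the lone "|" separator lines.

```python
import re
from statistics import fmean

log_excerpt = """\
key: test_mcc
|
|
value: [0.56530828 0.29269769 0.4822 0.36720508 0.51078616 0.36478009
|
|
0.53058923 0.41225113 0.6184593 0.45494186]
|
|
mean value: 0.45992188220568675
"""


def parse_cv_blocks(text):
    """Return {key: (fold_values, logged_mean)} for every key/value/mean block."""
    # Drop the lone '|' separator lines before matching.
    cleaned = "\n".join(ln for ln in text.splitlines() if ln.strip() != "|")
    pattern = re.compile(
        r"key: (\S+)\s+value: \[([^\]]*)\]\s+mean value: (\S+)"
    )
    results = {}
    for key, values, mean in pattern.findall(cleaned):
        results[key] = ([float(v) for v in values.split()], float(mean))
    return results


parsed = parse_cv_blocks(log_excerpt)
folds, logged_mean = parsed["test_mcc"]
recomputed = fmean(folds)
print(len(folds), recomputed, logged_mean)
```

Because the fold values are printed to eight decimals, the recomputed mean agrees with the logged mean only to about that precision.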
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01297903 0.01341438 0.01251864 0.01181912 0.01186943 0.0112195
|
|
0.01164675 0.0116322 0.0120945 0.01247215]
|
|
|
|
mean value: 0.012166571617126466
|
|
|
|
key: score_time
|
|
value: [0.0102303 0.00969195 0.00987315 0.00981855 0.00973344 0.0094564
|
|
0.00986338 0.00978637 0.00979996 0.00987363]
|
|
|
|
mean value: 0.009812712669372559
|
|
|
|
key: test_mcc
|
|
value: [0.21405867 0.22073036 0.12913133 0.48240733 0.4040992 0.16607219
|
|
0.13749357 0.33503026 0.17609018 0.04233617]
|
|
|
|
mean value: 0.23074492714957756
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.61842105 0.61842105 0.57894737 0.75 0.69736842 0.57333333
|
|
0.57333333 0.66666667 0.6 0.53333333]
|
|
|
|
mean value: 0.6209824561403509
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.53968254 0.55384615 0.48387097 0.65454545 0.67605634 0.55555556
|
|
0.51515152 0.63768116 0.51612903 0.44444444]
|
|
|
|
mean value: 0.5576963160674122
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5483871 0.54545455 0.5 0.7826087 0.61538462 0.48780488
|
|
0.48571429 0.59459459 0.53333333 0.4516129 ]
|
|
|
|
mean value: 0.5544894948182328
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.53125 0.5625 0.46875 0.5625 0.75 0.64516129
|
|
0.5483871 0.6875 0.5 0.4375 ]
|
|
|
|
mean value: 0.5693548387096774
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.60653409 0.61079545 0.56392045 0.72443182 0.70454545 0.58394428
|
|
0.56964809 0.6693314 0.5872093 0.52107558]
|
|
|
|
mean value: 0.614143592716361
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.36956522 0.38297872 0.31914894 0.48648649 0.5106383 0.38461538
|
|
0.34693878 0.46808511 0.34782609 0.28571429]
|
|
|
|
mean value: 0.39019973005039743
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.52887535 2.54070401 2.48067498 2.40389156 2.36476159 2.3751862
|
|
2.35466981 2.37321854 2.36698627 2.35058093]
|
|
|
|
mean value: 2.4139549255371096
|
|
|
|
key: score_time
|
|
value: [0.10534024 0.10218143 0.09882855 0.10170054 0.09665203 0.09587097
|
|
0.0963614 0.09620166 0.1036911 0.09608531]
|
|
|
|
mean value: 0.09929132461547852
|
|
|
|
key: test_mcc
|
|
value: [0.67460105 0.59365605 0.73881068 0.56410605 0.70463922 0.78329779
|
|
0.75402183 0.7663997 0.83648256 0.81028771]
|
|
|
|
mean value: 0.7226302653782992
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.84210526 0.80263158 0.86842105 0.78947368 0.85526316 0.89333333
|
|
0.88 0.88 0.92 0.90666667]
|
|
|
|
mean value: 0.8637894736842106
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.76190476 0.85294118 0.73333333 0.83076923 0.875
|
|
0.85714286 0.86956522 0.90625 0.89230769]
|
|
|
|
mean value: 0.8379214269319768
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.77419355 0.80555556 0.78571429 0.81818182 0.84848485
|
|
0.84375 0.81081081 0.90625 0.87878788]
|
|
|
|
mean value: 0.8328871603065151
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.90625 0.6875 0.84375 0.90322581
|
|
0.87096774 0.9375 0.90625 0.90625 ]
|
|
|
|
mean value: 0.8461693548387097
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.79545455 0.87357955 0.77556818 0.85369318 0.89479472
|
|
0.87866569 0.88735465 0.91824128 0.90661337]
|
|
|
|
mean value: 0.8613510621973675
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.61538462 0.74358974 0.57894737 0.71052632 0.77777778
|
|
0.75 0.76923077 0.82857143 0.80555556]
|
|
|
|
mean value: 0.7246250240987083
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
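The FutureWarning above names its own fix: set max_features='sqrt' (or drop the argument), which keeps the behaviour of 'auto' for RandomForestClassifier. A hedged sketch of the Random Forest2 configuration with that one change follows. The synthetic data is an assumption, and n_estimators and n_jobs are reduced from the logged 1000 and 10 only to keep the example quick.

```python
import warnings

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption), not the features used in the log.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

rf = RandomForestClassifier(
    max_features="sqrt",   # replaces the deprecated max_features='auto'
    min_samples_leaf=5,
    n_estimators=200,      # logged model uses n_estimators=1000
    n_jobs=1,              # logged model uses n_jobs=10
    oob_score=True,
    random_state=42,
)

with warnings.catch_warnings():
    # Turn any max_features deprecation warning into an error so it cannot pass silently.
    warnings.filterwarnings("error", message=".*max_features.*",
                            category=FutureWarning)
    rf.fit(X, y)

print("OOB score:", rf.oob_score_)
```

With oob_score=True the fitted estimator exposes oob_score_, an out-of-bag accuracy estimate that can be compared with the cross-validated test_accuracy.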
|
|
|
|
key: fit_time
|
|
value: [2.00792384 1.07964015 1.14048624 1.0987649 1.05054617 1.07768703
|
|
1.08028054 1.06092668 1.1231184 1.1270175 ]
|
|
|
|
mean value: 1.1846391439437867
|
|
|
|
key: score_time
|
|
value: [0.25205517 0.28182316 0.22127581 0.29803705 0.28213239 0.29165244
|
|
0.27522254 0.27276921 0.28709555 0.27418542]
|
|
|
|
mean value: 0.2736248731613159
|
|
|
|
key: test_mcc
|
|
value: [0.67460105 0.59365605 0.70914208 0.61935355 0.73011364 0.83784499
|
|
0.78005865 0.81415147 0.80876688 0.81028771]
|
|
|
|
mean value: 0.7377976067565862
|
|
|
|
key: train_mcc
|
|
value: [0.90938451 0.90326789 0.89730244 0.90962285 0.91874398 0.90651302
|
|
0.89762206 0.90958458 0.90344681 0.89741223]
|
|
|
|
mean value: 0.9052900373277607
|
|
|
|
key: test_accuracy
|
|
value: [0.84210526 0.80263158 0.85526316 0.81578947 0.86842105 0.92
|
|
0.89333333 0.90666667 0.90666667 0.90666667]
|
|
|
|
mean value: 0.8717543859649123
|
|
|
|
key: train_accuracy
|
|
value: [0.95581738 0.95287187 0.94992636 0.95581738 0.96023564 0.95441176
|
|
0.95 0.95588235 0.95294118 0.95 ]
|
|
|
|
mean value: 0.953790392445638
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.76190476 0.8358209 0.76666667 0.84375 0.90625
|
|
0.87096774 0.89552239 0.88888889 0.89230769]
|
|
|
|
mean value: 0.8462079035285583
|
|
|
|
key: train_fscore
|
|
value: [0.94755245 0.94385965 0.94055944 0.94791667 0.95320624 0.94589878
|
|
0.94097222 0.94773519 0.94405594 0.94055944]
|
|
|
|
mean value: 0.9452316019904221
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.77419355 0.8 0.82142857 0.84375 0.87878788
|
|
0.87096774 0.85714286 0.90322581 0.87878788]
|
|
|
|
mean value: 0.8485427140064237
|
|
|
|
key: train_precision
|
|
value: [0.94755245 0.9471831 0.94055944 0.94137931 0.94501718 0.94755245
|
|
0.93771626 0.94444444 0.94405594 0.94055944]
|
|
|
|
mean value: 0.9436020018766904
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.875 0.71875 0.84375 0.93548387
|
|
0.87096774 0.9375 0.875 0.90625 ]
|
|
|
|
mean value: 0.8462701612903226
|
|
|
|
key: train_recall
|
|
value: [0.94755245 0.94055944 0.94055944 0.95454545 0.96153846 0.94425087
|
|
0.94425087 0.95104895 0.94405594 0.94055944]
|
|
|
|
mean value: 0.946892132257986
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.79545455 0.85795455 0.80255682 0.86505682 0.92228739
|
|
0.89002933 0.91061047 0.90261628 0.90661337]
|
|
|
|
mean value: 0.8682725013639774
|
|
|
|
key: train_roc_auc
|
|
value: [0.95469225 0.95119575 0.94865122 0.95564423 0.960413 0.95304147
|
|
0.94922467 0.95521991 0.9517234 0.94870612]
|
|
|
|
mean value: 0.952851201686529
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.61538462 0.71794872 0.62162162 0.72972973 0.82857143
|
|
0.77142857 0.81081081 0.8 0.80555556]
|
|
|
|
mean value: 0.7367717717717718
|
|
|
|
key: train_jcc
|
|
value: [0.90033223 0.89368771 0.88778878 0.9009901 0.91059603 0.89735099
|
|
0.88852459 0.90066225 0.89403974 0.88778878]
|
|
|
|
mean value: 0.8961761187106945
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.03019071 0.01226044 0.01226664 0.01272726 0.01263142 0.01290274
|
|
0.01258135 0.01262927 0.01268077 0.01258755]
|
|
|
|
mean value: 0.014345812797546386
|
|
|
|
key: score_time
|
|
value: [0.01134753 0.00947285 0.01033521 0.01011801 0.01010323 0.01012897
|
|
0.01017475 0.0101254 0.0101676 0.01010561]
|
|
|
|
mean value: 0.010207915306091308
|
|
|
|
key: test_mcc
|
|
value: [0.59365605 0.28140559 0.17447146 0.48519965 0.48956862 0.46313625
|
|
0.35577952 0.38895144 0.5603175 0.25810271]
|
|
|
|
mean value: 0.40505887826745546
|
|
|
|
key: train_mcc
|
|
value: [0.432963 0.46234808 0.46694919 0.46570418 0.44960003 0.44154422
|
|
0.43213035 0.44706869 0.45428806 0.45555535]
|
|
|
|
mean value: 0.450815115609883
|
|
|
|
key: test_accuracy
|
|
value: [0.80263158 0.64473684 0.59210526 0.75 0.75 0.73333333
|
|
0.69333333 0.69333333 0.78666667 0.62666667]
|
|
|
|
mean value: 0.707280701754386
|
|
|
|
key: train_accuracy
|
|
value: [0.72164948 0.73784978 0.73784978 0.73784978 0.72901325 0.72647059
|
|
0.71911765 0.72941176 0.73235294 0.73235294]
|
|
|
|
mean value: 0.7303917958936151
|
|
|
|
key: test_fscore
|
|
value: [0.76190476 0.59701493 0.53731343 0.6984127 0.70769231 0.6969697
|
|
0.59649123 0.66666667 0.73333333 0.6 ]
|
|
|
|
mean value: 0.6595799051258595
|
|
|
|
key: train_fscore
|
|
value: [0.67692308 0.68881119 0.69727891 0.69520548 0.68813559 0.68041237
|
|
0.68113523 0.68275862 0.68835616 0.69047619]
|
|
|
|
mean value: 0.686949282203034
|
|
|
|
key: test_precision
|
|
value: [0.77419355 0.57142857 0.51428571 0.70967742 0.6969697 0.65714286
|
|
0.65384615 0.62162162 0.78571429 0.55263158]
|
|
|
|
mean value: 0.6537511447698205
|
|
|
|
key: train_precision
|
|
value: [0.66220736 0.68881119 0.67880795 0.68120805 0.66776316 0.67118644
|
|
0.65384615 0.67346939 0.67449664 0.67218543]
|
|
|
|
mean value: 0.67239817623147
|
|
|
|
key: test_recall
|
|
value: [0.75 0.625 0.5625 0.6875 0.71875 0.74193548
|
|
0.5483871 0.71875 0.6875 0.65625 ]
|
|
|
|
mean value: 0.6696572580645161
|
|
|
|
key: train_recall
|
|
value: [0.69230769 0.68881119 0.71678322 0.70979021 0.70979021 0.68989547
|
|
0.71080139 0.69230769 0.7027972 0.70979021]
|
|
|
|
mean value: 0.702307448648912
|
|
|
|
key: test_roc_auc
|
|
value: [0.79545455 0.64204545 0.58806818 0.74147727 0.74573864 0.73460411
|
|
0.67192082 0.6965843 0.77398256 0.63045058]
|
|
|
|
mean value: 0.7020326459455773
|
|
|
|
key: train_roc_auc
|
|
value: [0.71765512 0.73117404 0.73498194 0.73402996 0.72639638 0.72153807
|
|
0.71799612 0.72432643 0.72830215 0.72926059]
|
|
|
|
mean value: 0.7265660801452282
|
|
|
|
key: test_jcc
|
|
value: [0.61538462 0.42553191 0.36734694 0.53658537 0.54761905 0.53488372
|
|
0.425 0.5 0.57894737 0.42857143]
|
|
|
|
mean value: 0.4959870400449163
|
|
|
|
key: train_jcc
|
|
value: [0.51162791 0.52533333 0.53524804 0.5328084 0.5245478 0.515625
|
|
0.5164557 0.51832461 0.52480418 0.52727273]
|
|
|
|
mean value: 0.523204769300403
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
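Note on the MCC figures reported in this log: the sketch below shows one way the Matthews correlation coefficient can be computed from a binary confusion matrix. The function name `mcc_from_counts` and the example counts are hypothetical and are not taken from this run; the script that produced this log may well use a library scorer instead.

```python
from math import sqrt


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Hypothetical helper for illustration; not part of the logged script.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any row or column total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only (assumed, not from the log).
perfect = mcc_from_counts(tp=30, tn=40, fp=0, fn=0)    # every prediction correct
chance = mcc_from_counts(tp=20, tn=20, fp=20, fn=20)   # no better than guessing
print(perfect, chance)
```

MCC uses all four cells of the confusion matrix, which is why it can sit well below accuracy for the same model, as in the blind-test lines above.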
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.15459466 0.12359762 0.13257623 0.11857748 0.12010121 0.12283087
|
|
0.11335015 0.26927233 0.12270784 0.1195004 ]
|
|
|
|
mean value: 0.1397108793258667
|
|
|
|
key: score_time
|
|
value: [0.01213288 0.01233554 0.0116086 0.01127672 0.01136041 0.01135826
|
|
0.01148677 0.01124954 0.01111746 0.01170921]
|
|
|
|
mean value: 0.011563539505004883
|
|
|
|
key: test_mcc
|
|
value: [0.81056883 0.64788424 0.86594218 0.6198304 0.75650539 0.83784499
|
|
0.80876688 0.78485412 0.86351193 0.78485412]
|
|
|
|
mean value: 0.778056307737122
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90789474 0.82894737 0.93421053 0.81578947 0.88157895 0.92
|
|
0.90666667 0.89333333 0.93333333 0.89333333]
|
|
|
|
mean value: 0.8915087719298246
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.79365079 0.92307692 0.77419355 0.85245902 0.90625
|
|
0.88888889 0.87878788 0.92063492 0.87878788]
|
|
|
|
mean value: 0.8705618737496712
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90322581 0.80645161 0.90909091 0.8 0.89655172 0.87878788
|
|
0.875 0.85294118 0.93548387 0.85294118]
|
|
|
|
mean value: 0.8710474155280475
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.78125 0.9375 0.75 0.8125 0.93548387
|
|
0.90322581 0.90625 0.90625 0.90625 ]
|
|
|
|
mean value: 0.8713709677419355
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90340909 0.82244318 0.93465909 0.80681818 0.87215909 0.92228739
|
|
0.90615836 0.89498547 0.92986919 0.89498547]
|
|
|
|
mean value: 0.8887774500443293
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.65789474 0.85714286 0.63157895 0.74285714 0.82857143
|
|
0.8 0.78378378 0.85294118 0.78378378]
|
|
|
|
mean value: 0.7738553856820111
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
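Note on the XGBoost block above: every train_* metric is 1.0 in all ten folds, while the test_* means are lower. A minimal sketch of how that train/test gap could be summarised from the printed per-fold arrays follows. The helper name `summarize_gap` is an assumption for illustration; the fold values are copied from the test_mcc and train_mcc entries above.

```python
from statistics import fmean

# Per-fold values copied from the XGBoost test_mcc / train_mcc entries above.
test_mcc = [0.81056883, 0.64788424, 0.86594218, 0.6198304, 0.75650539,
            0.83784499, 0.80876688, 0.78485412, 0.86351193, 0.78485412]
train_mcc = [1.0] * 10


def summarize_gap(train, test):
    """Return mean train score, mean test score, and their difference.

    Hypothetical helper; a large positive gap is one common sign of overfitting.
    """
    train_mean = fmean(train)
    test_mean = fmean(test)
    return {"train_mean": train_mean,
            "test_mean": test_mean,
            "gap": train_mean - test_mean}


summary = summarize_gap(train_mcc, test_mcc)
print(summary)
```

The same helper can be applied to any of the other train_*/test_* pairs in this log to compare how far each model's training scores sit above its cross-validated scores.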
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.05141187 0.09285474 0.0785737 0.09084392 0.08712173 0.07668972
|
|
0.08316016 0.08379889 0.08368492 0.08396983]
|
|
|
|
mean value: 0.08121094703674317
|
|
|
|
key: score_time
|
|
value: [0.01939702 0.02001953 0.01978993 0.01961112 0.02019668 0.01976013
|
|
0.0197196 0.01980257 0.01976991 0.0196898 ]
|
|
|
|
mean value: 0.0197756290435791
|
|
|
|
key: test_mcc
|
|
value: [0.47970161 0.46022727 0.48956862 0.41185791 0.51905381 0.6340079
|
|
0.54846768 0.52770861 0.56395349 0.57412984]
|
|
|
|
mean value: 0.5208676744130143
|
|
|
|
key: train_mcc
|
|
value: [0.75228604 0.76267653 0.75505868 0.75783987 0.76384416 0.73844324
|
|
0.73034032 0.74334134 0.7373458 0.75579488]
|
|
|
|
mean value: 0.7496970872434735
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.73684211 0.75 0.71052632 0.76315789 0.81333333
|
|
0.77333333 0.76 0.78666667 0.78666667]
|
|
|
|
mean value: 0.7617368421052632
|
|
|
|
key: train_accuracy
|
|
value: [0.87776141 0.88365243 0.87923417 0.88070692 0.88365243 0.87205882
|
|
0.86617647 0.87352941 0.87058824 0.87941176]
|
|
|
|
mean value: 0.8766772069652603
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.6875 0.70769231 0.66666667 0.72727273 0.79411765
|
|
0.74626866 0.74285714 0.75 0.76470588]
|
|
|
|
mean value: 0.7301366744902742
|
|
|
|
key: train_fscore
|
|
value: [0.85908319 0.86402754 0.86054422 0.86201022 0.86541738 0.84974093
|
|
0.84757119 0.8537415 0.85034014 0.86101695]
|
|
|
|
mean value: 0.8573493249947532
|
|
|
|
key: test_precision
|
|
value: [0.65789474 0.6875 0.6969697 0.64705882 0.70588235 0.72972973
|
|
0.69444444 0.68421053 0.75 0.72222222]
|
|
|
|
mean value: 0.6975912532994577
|
|
|
|
key: train_precision
|
|
value: [0.8349835 0.85084746 0.83774834 0.84053156 0.84385382 0.84246575
|
|
0.81612903 0.83112583 0.82781457 0.83552632]
|
|
|
|
mean value: 0.8361026181230804
|
|
|
|
key: test_recall
|
|
value: [0.78125 0.6875 0.71875 0.6875 0.75 0.87096774
|
|
0.80645161 0.8125 0.75 0.8125 ]
|
|
|
|
mean value: 0.7677419354838709
|
|
|
|
key: train_recall
|
|
value: [0.88461538 0.87762238 0.88461538 0.88461538 0.88811189 0.85714286
|
|
0.8815331 0.87762238 0.87412587 0.88811189]
|
|
|
|
mean value: 0.8798116517628712
|
|
|
|
key: test_roc_auc
|
|
value: [0.74289773 0.73011364 0.74573864 0.70738636 0.76136364 0.82184751
|
|
0.77822581 0.76671512 0.78197674 0.78997093]
|
|
|
|
mean value: 0.7626236104480666
|
|
|
|
key: train_roc_auc
|
|
value: [0.87869446 0.88283155 0.87996673 0.88123899 0.88425951 0.87004726
|
|
0.86824747 0.87409038 0.87107309 0.88060417]
|
|
|
|
mean value: 0.8771053583080383
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.52380952 0.54761905 0.5 0.57142857 0.65853659
|
|
0.5952381 0.59090909 0.6 0.61904762]
|
|
|
|
mean value: 0.5762144088973358
|
|
|
|
key: train_jcc
|
|
value: [0.75297619 0.76060606 0.75522388 0.75748503 0.76276276 0.73873874
|
|
0.73546512 0.74480712 0.73964497 0.75595238]
|
|
|
|
mean value: 0.750366225242826
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
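Note on the test_jcc and test_fscore entries: for a single binary class, the Jaccard index J and the F1 score F are related by J = F / (2 - F). The sketch below checks that identity against the first three LDA folds printed above. The function name `jaccard_from_f1` is hypothetical; the numbers are copied from the log.

```python
def jaccard_from_f1(f1):
    """Convert a binary F1 score to the equivalent Jaccard index.

    Uses J = F / (2 - F), which follows from F = 2TP / (2TP + FP + FN)
    and J = TP / (TP + FP + FN). Hypothetical helper for illustration.
    """
    return f1 / (2.0 - f1)


# First three folds of the LDA test_fscore and test_jcc entries above.
lda_test_fscore = [0.71428571, 0.6875, 0.70769231]
lda_test_jcc = [0.55555556, 0.52380952, 0.54761905]

derived_jcc = [jaccard_from_f1(f) for f in lda_test_fscore]
for logged, derived in zip(lda_test_jcc, derived_jcc):
    print(f"logged={logged:.8f} derived={derived:.8f}")
```

Because one metric is a monotone function of the other, test_jcc ranks the folds and models exactly as test_fscore does, so the two columns carry the same information.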
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01528907 0.0125401 0.0123477 0.01201797 0.01186919 0.01119399
|
|
0.01182628 0.01187348 0.0116477 0.01158071]
|
|
|
|
mean value: 0.012218618392944336
|
|
|
|
key: score_time
|
|
value: [0.01129532 0.01013231 0.00990796 0.0098362 0.00951409 0.00974369
|
|
0.00975084 0.00975108 0.00947189 0.00925994]
|
|
|
|
mean value: 0.0098663330078125
|
|
|
|
key: test_mcc
|
|
value: [0.35825997 0.38203331 0.36519159 0.39836355 0.59365605 0.5716838
|
|
0.33434239 0.49312416 0.5930524 0.36594507]
|
|
|
|
mean value: 0.4455652298405159
|
|
|
|
key: train_mcc
|
|
value: [0.47425208 0.45368292 0.47959222 0.49692127 0.46628488 0.46594407
|
|
0.46798304 0.46220361 0.47219115 0.45022029]
|
|
|
|
mean value: 0.4689275528576301
|
|
|
|
key: test_accuracy
|
|
value: [0.68421053 0.69736842 0.68421053 0.71052632 0.80263158 0.78666667
|
|
0.68 0.73333333 0.8 0.68 ]
|
|
|
|
mean value: 0.7258947368421053
|
|
|
|
key: train_accuracy
|
|
value: [0.74079529 0.73195876 0.7437408 0.75110457 0.73637703 0.73676471
|
|
0.73676471 0.73529412 0.73970588 0.72941176]
|
|
|
|
mean value: 0.7381917612405787
|
|
|
|
key: test_fscore
|
|
value: [0.63636364 0.64615385 0.64705882 0.63333333 0.76190476 0.75757576
|
|
0.6 0.72972973 0.76923077 0.65714286]
|
|
|
|
mean value: 0.6838493514964104
|
|
|
|
key: train_fscore
|
|
value: [0.7027027 0.68835616 0.70508475 0.71691792 0.69915966 0.69814503
|
|
0.70116861 0.69491525 0.70151771 0.68813559]
|
|
|
|
mean value: 0.6996103393349323
|
|
|
|
key: test_precision
|
|
value: [0.61764706 0.63636364 0.61111111 0.67857143 0.77419355 0.71428571
|
|
0.62068966 0.64285714 0.75757576 0.60526316]
|
|
|
|
mean value: 0.6658558211042568
|
|
|
|
key: train_precision
|
|
value: [0.67973856 0.67449664 0.68421053 0.68810289 0.67313916 0.67647059
|
|
0.67307692 0.67434211 0.67752443 0.66776316]
|
|
|
|
mean value: 0.6768864989606861
|
|
|
|
key: test_recall
|
|
value: [0.65625 0.65625 0.6875 0.59375 0.75 0.80645161
|
|
0.58064516 0.84375 0.78125 0.71875 ]
|
|
|
|
mean value: 0.7074596774193549
|
|
|
|
key: train_recall
|
|
value: [0.72727273 0.7027972 0.72727273 0.74825175 0.72727273 0.72125436
|
|
0.73170732 0.71678322 0.72727273 0.70979021]
|
|
|
|
mean value: 0.7239674959187155
|
|
|
|
key: test_roc_auc
|
|
value: [0.68039773 0.69176136 0.68465909 0.69460227 0.79545455 0.78958944
|
|
0.66532258 0.7474564 0.79760174 0.6849564 ]
|
|
|
|
mean value: 0.7231801558344132
|
|
|
|
key: train_roc_auc
|
|
value: [0.73895443 0.72798893 0.74149896 0.7507162 0.73513764 0.73467298
|
|
0.73608267 0.73275709 0.73800185 0.72672252]
|
|
|
|
mean value: 0.7362533259808247
|
|
|
|
key: test_jcc
|
|
value: [0.46666667 0.47727273 0.47826087 0.46341463 0.61538462 0.6097561
|
|
0.42857143 0.57446809 0.625 0.4893617 ]
|
|
|
|
mean value: 0.5228156826402015
|
|
|
|
key: train_jcc
|
|
value: [0.54166667 0.52480418 0.54450262 0.55874674 0.5374677 0.53626943
|
|
0.53984576 0.53246753 0.54025974 0.5245478 ]
|
|
|
|
mean value: 0.5380578163315645
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02463841 0.03144503 0.02579641 0.02777719 0.02671218 0.02404237
|
|
0.02409434 0.0281806 0.02527308 0.02363944]
|
|
|
|
mean value: 0.026159906387329103
|
|
|
|
key: score_time
|
|
value: [0.01157069 0.01227355 0.01218486 0.01215577 0.01218534 0.01224351
|
|
0.0121696 0.01213717 0.01212764 0.01214361]
|
|
|
|
mean value: 0.012119174003601074
|
|
|
|
key: test_mcc
|
|
value: [0.2763854 0.34311605 0.56410605 0.44912659 0.6053757 0.49865921
|
|
0.64112865 0.30980985 0.42806382 0.22589682]
|
|
|
|
mean value: 0.4341668135250491
|
|
|
|
key: train_mcc
|
|
value: [0.24210359 0.65038224 0.62574954 0.52435929 0.62830318 0.6292708
|
|
0.62197849 0.24323178 0.40003685 0.31982482]
|
|
|
|
mean value: 0.4885240589433125
|
|
|
|
key: test_accuracy
|
|
value: [0.63157895 0.68421053 0.78947368 0.65789474 0.80263158 0.76
|
|
0.82666667 0.64 0.69333333 0.62666667]
|
|
|
|
mean value: 0.7112456140350878
|
|
|
|
key: train_accuracy
|
|
value: [0.62150221 0.83063328 0.81885125 0.69955817 0.8173785 0.81911765
|
|
0.81617647 0.62058824 0.69558824 0.64852941]
|
|
|
|
mean value: 0.7387923416789396
|
|
|
|
key: test_fscore
|
|
value: [0.22222222 0.6 0.73333333 0.70454545 0.71698113 0.68965517
|
|
0.78688525 0.27027027 0.43902439 0.3 ]
|
|
|
|
mean value: 0.5462917221006087
|
|
|
|
key: train_fscore
|
|
value: [0.18927445 0.78743068 0.77348066 0.7357513 0.752 0.76116505
|
|
0.76007678 0.17834395 0.46233766 0.28228228]
|
|
|
|
mean value: 0.568214280782849
|
|
|
|
key: test_precision
|
|
value: [1. 0.64285714 0.78571429 0.55357143 0.9047619 0.74074074
|
|
0.8 1. 1. 0.75 ]
|
|
|
|
mean value: 0.8177645502645503
|
|
|
|
key: train_precision
|
|
value: [0.96774194 0.83529412 0.81712062 0.58436214 0.87850467 0.85964912
|
|
0.84615385 1. 0.8989899 1. ]
|
|
|
|
mean value: 0.8687816356464677
|
|
|
|
key: test_recall
|
|
value: [0.125 0.5625 0.6875 0.96875 0.59375 0.64516129
|
|
0.77419355 0.15625 0.28125 0.1875 ]
|
|
|
|
mean value: 0.4981854838709677
|
|
|
|
key: train_recall
|
|
value: [0.1048951 0.74475524 0.73426573 0.99300699 0.65734266 0.68292683
|
|
0.68989547 0.0979021 0.31118881 0.16433566]
|
|
|
|
mean value: 0.5180514607343876
|
|
|
|
key: test_roc_auc
|
|
value: [0.5625 0.66761364 0.77556818 0.70028409 0.77414773 0.74303519
|
|
0.81891496 0.578125 0.640625 0.57049419]
|
|
|
|
mean value: 0.6831307969037714
|
|
|
|
key: train_roc_auc
|
|
value: [0.55117529 0.81894251 0.80733643 0.73950604 0.79559245 0.80075095
|
|
0.79914621 0.54895105 0.64290405 0.58216783]
|
|
|
|
mean value: 0.7086472800759291
|
|
|
|
key: test_jcc
|
|
value: [0.125 0.42857143 0.57894737 0.54385965 0.55882353 0.52631579
|
|
0.64864865 0.15625 0.28125 0.17647059]
|
|
|
|
mean value: 0.402413700188468
|
|
|
|
key: train_jcc
|
|
value: [0.10452962 0.64939024 0.63063063 0.58196721 0.6025641 0.61442006
|
|
0.6130031 0.0979021 0.30067568 0.16433566]
|
|
|
|
mean value: 0.43594184035212596
|
|
|
|
MCC on Blind test: 0.39
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0311873 0.03052235 0.0290103 0.03574967 0.03354788 0.02688074
|
|
0.03113914 0.02523851 0.03100848 0.02994823]
|
|
|
|
mean value: 0.03042325973510742
|
|
|
|
key: score_time
|
|
value: [0.01227045 0.0122273 0.01223731 0.01219487 0.01226044 0.01223755
|
|
0.01211143 0.01226664 0.01223755 0.01227546]
|
|
|
|
mean value: 0.012231898307800294
|
|
|
|
key: test_mcc
|
|
value: [0.51340315 0.45464014 0.33711412 0.48064296 0.44756059 0.52646266
|
|
0.50718127 0.55985938 0.63555097 0.48272697]
|
|
|
|
mean value: 0.4945142231717486
|
|
|
|
key: train_mcc
|
|
value: [0.6517648 0.64055475 0.60314556 0.70913466 0.48594004 0.60604618
|
|
0.64341812 0.6951258 0.66005565 0.6439389 ]
|
|
|
|
mean value: 0.6339124456528761
|
|
|
|
key: test_accuracy
|
|
value: [0.76315789 0.73684211 0.68421053 0.75 0.72368421 0.77333333
|
|
0.76 0.76 0.81333333 0.73333333]
|
|
|
|
mean value: 0.7497894736842106
|
|
|
|
key: train_accuracy
|
|
value: [0.82474227 0.82326951 0.80559647 0.85861561 0.73048601 0.80735294
|
|
0.82205882 0.83382353 0.81911765 0.80735294]
|
|
|
|
mean value: 0.8132415749805076
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.62962963 0.55555556 0.6779661 0.55319149 0.70175439
|
|
0.64 0.76315789 0.8 0.72222222]
|
|
|
|
mean value: 0.6710143945832445
|
|
|
|
key: train_fscore
|
|
value: [0.75259875 0.76095618 0.73493976 0.82156134 0.53904282 0.74059406
|
|
0.75255624 0.82852807 0.81105991 0.80241327]
|
|
|
|
mean value: 0.7544250396680352
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.77272727 0.68181818 0.74074074 0.86666667 0.76923077
|
|
0.84210526 0.65909091 0.73684211 0.65 ]
|
|
|
|
mean value: 0.7537403726877411
|
|
|
|
key: train_precision
|
|
value: [0.92820513 0.88425926 0.86320755 0.87698413 0.96396396 0.85779817
|
|
0.91089109 0.73190349 0.72328767 0.70557029]
|
|
|
|
mean value: 0.8446070728093572
|
|
|
|
key: test_recall
|
|
value: [0.5625 0.53125 0.46875 0.625 0.40625 0.64516129
|
|
0.51612903 0.90625 0.875 0.8125 ]
|
|
|
|
mean value: 0.6348790322580645
|
|
|
|
key: train_recall
|
|
value: [0.63286713 0.66783217 0.63986014 0.77272727 0.37412587 0.65156794
|
|
0.64111498 0.95454545 0.92307692 0.93006993]
|
|
|
|
mean value: 0.7187787821934164
|
|
|
|
key: test_roc_auc
|
|
value: [0.73579545 0.70880682 0.65482955 0.73295455 0.68039773 0.75439883
|
|
0.72397361 0.7787064 0.82122093 0.7434593 ]
|
|
|
|
mean value: 0.7334543152833662
|
|
|
|
key: train_roc_auc
|
|
value: [0.79862186 0.80210947 0.7830344 0.84692343 0.68197388 0.78634377
|
|
0.79765673 0.85036917 0.83336587 0.82417202]
|
|
|
|
mean value: 0.8004570600754091
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.45945946 0.38461538 0.51282051 0.38235294 0.54054054
|
|
0.47058824 0.61702128 0.66666667 0.56521739]
|
|
|
|
mean value: 0.5099282408473245
|
|
|
|
key: train_jcc
|
|
value: [0.60333333 0.61414791 0.58095238 0.69716088 0.36896552 0.58805031
|
|
0.60327869 0.70725389 0.68217054 0.67002519]
|
|
|
|
mean value: 0.6115338645328594
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.24882793 0.23553991 0.23604679 0.23582315 0.22026038 0.22161198
|
|
0.22113228 0.22108078 0.22125912 0.22034097]
|
|
|
|
mean value: 0.22819232940673828
|
|
|
|
key: score_time
|
|
value: [0.01706672 0.01650715 0.01781344 0.01561666 0.01562548 0.01558304
|
|
0.015558 0.01575947 0.01544976 0.0155189 ]
|
|
|
|
mean value: 0.016049861907958984
|
|
|
|
key: test_mcc
|
|
value: [0.75906419 0.70463922 0.74620251 0.56530828 0.75840687 0.80693778
|
|
0.75856554 0.82032088 0.87424206 0.75597889]
|
|
|
|
mean value: 0.7549666233834014
|
|
|
|
key: train_mcc
|
|
value: [0.88569188 0.8827871 0.88843663 0.88534567 0.90948725 0.88249665
|
|
0.8914807 0.88270706 0.86486766 0.88841788]
|
|
|
|
mean value: 0.8861718479384622
|
|
|
|
key: test_accuracy
|
|
value: [0.88157895 0.85526316 0.86842105 0.78947368 0.88157895 0.90666667
|
|
0.88 0.90666667 0.93333333 0.88 ]
|
|
|
|
mean value: 0.8782982456140351
|
|
|
|
key: train_accuracy
|
|
value: [0.94403535 0.94256259 0.9455081 0.94403535 0.95581738 0.94264706
|
|
0.94705882 0.94264706 0.93382353 0.94558824]
|
|
|
|
mean value: 0.9443723468768951
|
|
|
|
key: test_fscore
|
|
value: [0.84745763 0.83076923 0.85714286 0.74193548 0.86153846 0.8852459
|
|
0.86153846 0.89855072 0.92753623 0.86153846]
|
|
|
|
mean value: 0.8573253441678168
|
|
|
|
key: train_fscore
|
|
value: [0.93425606 0.93264249 0.93565217 0.93379791 0.94773519 0.93217391
|
|
0.93728223 0.93240901 0.92227979 0.93542757]
|
|
|
|
mean value: 0.9343656339425788
|
|
|
|
key: test_precision
|
|
value: [0.92592593 0.81818182 0.78947368 0.76666667 0.84848485 0.9
|
|
0.82352941 0.83783784 0.86486486 0.84848485]
|
|
|
|
mean value: 0.8423449906422042
|
|
|
|
key: train_precision
|
|
value: [0.92465753 0.92150171 0.93079585 0.93055556 0.94444444 0.93055556
|
|
0.93728223 0.92439863 0.9112628 0.93379791]
|
|
|
|
mean value: 0.9289252207474825
|
|
|
|
key: test_recall
|
|
value: [0.78125 0.84375 0.9375 0.71875 0.875 0.87096774
|
|
0.90322581 0.96875 1. 0.875 ]
|
|
|
|
mean value: 0.8774193548387097
|
|
|
|
key: train_recall
|
|
value: [0.94405594 0.94405594 0.94055944 0.93706294 0.95104895 0.93379791
|
|
0.93728223 0.94055944 0.93356643 0.93706294]
|
|
|
|
mean value: 0.939905216734485
|
|
|
|
key: test_roc_auc
|
|
value: [0.86789773 0.85369318 0.87784091 0.77982955 0.88068182 0.90139296
|
|
0.88343109 0.91460756 0.94186047 0.87936047]
|
|
|
|
mean value: 0.8780595717111096
|
|
|
|
key: train_roc_auc
|
|
value: [0.94403815 0.94276589 0.94483443 0.94308618 0.95516824 0.94145366
|
|
0.94574035 0.94236094 0.93378829 0.94441979]
|
|
|
|
mean value: 0.9437655919246752
|
|
|
|
key: test_jcc
|
|
value: [0.73529412 0.71052632 0.75 0.58974359 0.75675676 0.79411765
|
|
0.75675676 0.81578947 0.86486486 0.75675676]
|
|
|
|
mean value: 0.7530606279058292
|
|
|
|
key: train_jcc
|
|
value: [0.87662338 0.87378641 0.87908497 0.87581699 0.90066225 0.87296417
|
|
0.88196721 0.87337662 0.85576923 0.87868852]
|
|
|
|
mean value: 0.876873975806219
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.1199789 0.11597323 0.1297648 0.13439751 0.1378653 0.13134599
|
|
0.1068418 0.12545419 0.11151147 0.11002517]
|
|
|
|
mean value: 0.12231583595275879
|
|
|
|
key: score_time
|
|
value: [0.02938819 0.03441763 0.03304958 0.0306685 0.04402232 0.02725291
|
|
0.03886533 0.02465963 0.03792119 0.02057219]
|
|
|
|
mean value: 0.03208174705505371
|
|
|
|
key: test_mcc
|
|
value: [0.73011364 0.59365605 0.81056883 0.61935355 0.75650539 0.78329779
|
|
0.77914379 0.76011486 0.83648256 0.70167006]
|
|
|
|
mean value: 0.737090652577633
|
|
|
|
key: train_mcc
|
|
value: [0.98491637 0.98497426 0.9969826 0.99395897 0.98497426 0.98794895
|
|
0.9909846 0.97297152 0.97588669 0.98793085]
|
|
|
|
mean value: 0.9861529074785003
|
|
|
|
key: test_accuracy
|
|
value: [0.86842105 0.80263158 0.90789474 0.81578947 0.88157895 0.89333333
|
|
0.89333333 0.88 0.92 0.85333333]
|
|
|
|
mean value: 0.8716315789473684
|
|
|
|
key: train_accuracy
|
|
value: [0.99263623 0.99263623 0.99852725 0.99705449 0.99263623 0.99411765
|
|
0.99558824 0.98676471 0.98823529 0.99411765]
|
|
|
|
mean value: 0.9932313956510439
|
|
|
|
key: test_fscore
|
|
value: [0.84375 0.76190476 0.88888889 0.76666667 0.85245902 0.875
|
|
0.86666667 0.86567164 0.90625 0.83076923]
|
|
|
|
mean value: 0.8458026873080702
|
|
|
|
key: train_fscore
|
|
value: [0.99121265 0.99118166 0.99824869 0.9965035 0.99118166 0.99300699
|
|
0.99474606 0.9840708 0.98591549 0.99300699]
|
|
|
|
mean value: 0.9919074487470159
|
|
|
|
key: test_precision
|
|
value: [0.84375 0.77419355 0.90322581 0.82142857 0.89655172 0.84848485
|
|
0.89655172 0.82857143 0.90625 0.81818182]
|
|
|
|
mean value: 0.8537189469781239
|
|
|
|
key: train_precision
|
|
value: [0.99646643 1. 1. 0.9965035 1. 0.99649123
|
|
1. 0.99641577 0.9929078 0.99300699]
|
|
|
|
mean value: 0.997179172070383
|
|
|
|
key: test_recall
|
|
value: [0.84375 0.75 0.875 0.71875 0.8125 0.90322581
|
|
0.83870968 0.90625 0.90625 0.84375 ]
|
|
|
|
mean value: 0.8398185483870968
|
|
|
|
key: train_recall
|
|
value: [0.98601399 0.98251748 0.9965035 0.9965035 0.98251748 0.98954704
|
|
0.98954704 0.97202797 0.97902098 0.99300699]
|
|
|
|
mean value: 0.9867205964766941
|
|
|
|
key: test_roc_auc
|
|
value: [0.86505682 0.79545455 0.90340909 0.80255682 0.87215909 0.89479472
|
|
0.88526393 0.88335756 0.91824128 0.85210756]
|
|
|
|
mean value: 0.8672401410011594
|
|
|
|
key: train_roc_auc
|
|
value: [0.99173473 0.99125874 0.99825175 0.99697948 0.99125874 0.99350125
|
|
0.99477352 0.98474495 0.98697242 0.99396543]
|
|
|
|
mean value: 0.9923441010825366
|
|
|
|
key: test_jcc
|
|
value: [0.72972973 0.61538462 0.8 0.62162162 0.74285714 0.77777778
|
|
0.76470588 0.76315789 0.82857143 0.71052632]
|
|
|
|
mean value: 0.7354332408821573
|
|
|
|
key: train_jcc
|
|
value: [0.9825784 0.98251748 0.9965035 0.99303136 0.98251748 0.98611111
|
|
0.98954704 0.96864111 0.97222222 0.98611111]
|
|
|
|
mean value: 0.9839780815390572
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.29794788 0.33068109 0.32837296 0.22896051 0.25346708 0.25612593
|
|
0.30490828 0.32389164 0.38753223 0.31187224]
|
|
|
|
mean value: 0.30237598419189454
|
|
|
|
key: score_time
|
|
value: [0.03014493 0.03123856 0.03024912 0.01791668 0.01918936 0.01773715
|
|
0.01803207 0.03264809 0.03435659 0.01904559]
|
|
|
|
mean value: 0.025055813789367675
|
|
|
|
key: test_mcc
|
|
value: [0.42733892 0.23817557 0.31048235 0.30837431 0.42733892 0.38584038
|
|
0.44554274 0.37080648 0.44845932 0.40043605]
|
|
|
|
mean value: 0.376279503821083
|
|
|
|
key: train_mcc
|
|
value: [0.91859995 0.93049929 0.91836725 0.92155805 0.91836725 0.93363727
|
|
0.92457091 0.92471388 0.91551596 0.90937403]
|
|
|
|
mean value: 0.9215203827710721
|
|
|
|
key: test_accuracy
|
|
value: [0.72368421 0.63157895 0.67105263 0.67105263 0.72368421 0.70666667
|
|
0.73333333 0.69333333 0.73333333 0.70666667]
|
|
|
|
mean value: 0.6994385964912281
|
|
|
|
key: train_accuracy
|
|
value: [0.96023564 0.96612666 0.96023564 0.96170839 0.96023564 0.96764706
|
|
0.96323529 0.96323529 0.95882353 0.95588235]
|
|
|
|
mean value: 0.9617365502902192
|
|
|
|
key: test_fscore
|
|
value: [0.6557377 0.5483871 0.56140351 0.54545455 0.6557377 0.62068966
|
|
0.66666667 0.63492063 0.64285714 0.65625 ]
|
|
|
|
mean value: 0.6188104660453593
|
|
|
|
key: train_fscore
|
|
value: [0.95304348 0.95971979 0.95254833 0.95470383 0.95254833 0.96153846
|
|
0.95621716 0.95652174 0.95104895 0.9471831 ]
|
|
|
|
mean value: 0.9545073174845851
|
|
|
|
key: test_precision
|
|
value: [0.68965517 0.56666667 0.64 0.65217391 0.68965517 0.66666667
|
|
0.68965517 0.64516129 0.75 0.65625 ]
|
|
|
|
mean value: 0.6645884053940772
|
|
|
|
key: train_precision
|
|
value: [0.94809689 0.96140351 0.95759717 0.95138889 0.95759717 0.96491228
|
|
0.96126761 0.95155709 0.95104895 0.95390071]
|
|
|
|
mean value: 0.9558770269793693
|
|
|
|
key: test_recall
|
|
value: [0.625 0.53125 0.5 0.46875 0.625 0.58064516
|
|
0.64516129 0.625 0.5625 0.65625 ]
|
|
|
|
mean value: 0.5819556451612903
|
|
|
|
key: train_recall
|
|
value: [0.95804196 0.95804196 0.94755245 0.95804196 0.94755245 0.95818815
|
|
0.95121951 0.96153846 0.95104895 0.94055944]
|
|
|
|
mean value: 0.9531785287882849
|
|
|
|
key: test_roc_auc
|
|
value: [0.71022727 0.61789773 0.64772727 0.64346591 0.71022727 0.68804985
|
|
0.72030792 0.68459302 0.71148256 0.70021802]
|
|
|
|
mean value: 0.6834196830457614
|
|
|
|
key: train_roc_auc
|
|
value: [0.95993701 0.96502607 0.95850905 0.96120927 0.95850905 0.96637143
|
|
0.96161485 0.96300273 0.95775798 0.95378226]
|
|
|
|
mean value: 0.9605719693449956
|
|
|
|
key: test_jcc
|
|
value: [0.48780488 0.37777778 0.3902439 0.375 0.48780488 0.45
|
|
0.5 0.46511628 0.47368421 0.48837209]
|
|
|
|
mean value: 0.44958040189337023
|
|
|
|
key: train_jcc
|
|
value: [0.910299 0.92255892 0.90939597 0.91333333 0.90939597 0.92592593
|
|
0.91610738 0.91666667 0.90666667 0.89966555]
|
|
|
|
mean value: 0.91300153991723
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.98012018 1.0008018 0.97443652 0.96889162 0.98432636 0.98858047
|
|
0.97377872 0.96670055 0.95690131 0.9568224 ]
|
|
|
|
mean value: 0.9751359939575195
|
|
|
|
key: score_time
|
|
value: [0.0106864 0.0102756 0.00961518 0.01002574 0.01072812 0.00973248
|
|
0.00965405 0.00952554 0.00958014 0.00956964]
|
|
|
|
mean value: 0.009939289093017578
|
|
|
|
key: test_mcc
|
|
value: [0.83791887 0.70463922 0.86954326 0.73011364 0.72984855 0.86351193
|
|
0.83504399 0.8439277 0.89098837 0.78485412]
|
|
|
|
mean value: 0.8090389648876973
|
|
|
|
key: train_mcc
|
|
value: [0.95467504 0.96074971 0.95774986 0.96375527 0.97886645 0.94879831
|
|
0.97291732 0.96691528 0.95471476 0.95773493]
|
|
|
|
mean value: 0.9616876917488373
|
|
|
|
key: test_accuracy
|
|
value: [0.92105263 0.85526316 0.93421053 0.86842105 0.86842105 0.93333333
|
|
0.92 0.92 0.94666667 0.89333333]
|
|
|
|
mean value: 0.9060701754385965
|
|
|
|
key: train_accuracy
|
|
value: [0.97790869 0.9808542 0.97938144 0.98232695 0.98969072 0.975
|
|
0.98676471 0.98382353 0.97794118 0.97941176]
|
|
|
|
mean value: 0.9813103179416096
|
|
|
|
key: test_fscore
|
|
value: [0.90322581 0.83076923 0.92537313 0.84375 0.83333333 0.92063492
|
|
0.90322581 0.91176471 0.9375 0.87878788]
|
|
|
|
mean value: 0.88883648166393
|
|
|
|
key: train_fscore
|
|
value: [0.9737303 0.97707231 0.97526502 0.97887324 0.98769772 0.97001764
|
|
0.98418278 0.98053097 0.97363796 0.9754386 ]
|
|
|
|
mean value: 0.9776446525287324
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.81818182 0.88571429 0.84375 0.89285714 0.90625
|
|
0.90322581 0.86111111 0.9375 0.85294118]
|
|
|
|
mean value: 0.8834864674119892
|
|
|
|
key: train_precision
|
|
value: [0.9754386 0.98576512 0.98571429 0.9858156 0.99293286 0.98214286
|
|
0.9929078 0.99283154 0.97879859 0.97887324]
|
|
|
|
mean value: 0.9851220497577359
|
|
|
|
key: test_recall
|
|
value: [0.875 0.84375 0.96875 0.84375 0.78125 0.93548387
|
|
0.90322581 0.96875 0.9375 0.90625 ]
|
|
|
|
mean value: 0.8963709677419355
|
|
|
|
key: train_recall
|
|
value: [0.97202797 0.96853147 0.96503497 0.97202797 0.98251748 0.95818815
|
|
0.97560976 0.96853147 0.96853147 0.97202797]
|
|
|
|
mean value: 0.9703028678638435
|
|
|
|
key: test_roc_auc
|
|
value: [0.91477273 0.85369318 0.93892045 0.86505682 0.85653409 0.93365103
|
|
0.91752199 0.92623547 0.94549419 0.89498547]
|
|
|
|
mean value: 0.9046865409534202
|
|
|
|
key: train_roc_auc
|
|
value: [0.97710813 0.97917668 0.97742842 0.98092493 0.98871421 0.97273275
|
|
0.98526035 0.98172766 0.97665152 0.97839977]
|
|
|
|
mean value: 0.9798124432188077
|
|
|
|
key: test_jcc
|
|
value: [0.82352941 0.71052632 0.86111111 0.72972973 0.71428571 0.85294118
|
|
0.82352941 0.83783784 0.88235294 0.78378378]
|
|
|
|
mean value: 0.801962743371412
|
|
|
|
key: train_jcc
|
|
value: [0.94880546 0.95517241 0.95172414 0.95862069 0.97569444 0.94178082
|
|
0.96885813 0.96180556 0.94863014 0.95205479]
|
|
|
|
mean value: 0.956314658704271
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03680634 0.03623056 0.03745699 0.0447278 0.03589249 0.0368638
|
|
0.03617692 0.03619432 0.03663182 0.04341626]
|
|
|
|
mean value: 0.03803973197937012
|
|
|
|
key: score_time
|
|
value: [0.01258159 0.01285815 0.01291418 0.01342416 0.01339483 0.01283622
|
|
0.01341558 0.01345205 0.01280928 0.01292062]
|
|
|
|
mean value: 0.013060665130615235
|
|
|
|
key: test_mcc
|
|
value: [ 0.06511653 -0.01736441 0.01131552 -0.05484543 0.09543651 0.07872809
|
|
0.04790658 0.02114775 -0.00123563 0.2543852 ]
|
|
|
|
mean value: 0.05005907069255838
|
|
|
|
key: train_mcc
|
|
value: [0.23263611 0.26559326 0.22727327 0.25075154 0.2245568 0.22202874
|
|
0.22477213 0.24260216 0.23742053 0.21015426]
|
|
|
|
mean value: 0.23377887989238055
|
|
|
|
key: test_accuracy
|
|
value: [0.46052632 0.43421053 0.43421053 0.42105263 0.47368421 0.44
|
|
0.44 0.45333333 0.44 0.50666667]
|
|
|
|
mean value: 0.4503684210526316
|
|
|
|
key: train_accuracy
|
|
value: [0.4904271 0.5095729 0.48748159 0.50073638 0.48600884 0.48529412
|
|
0.48676471 0.49558824 0.49264706 0.47794118]
|
|
|
|
mean value: 0.49124620982413586
|
|
|
|
key: test_fscore
|
|
value: [0.58585859 0.56565657 0.58252427 0.56 0.59183673 0.58823529
|
|
0.58 0.57731959 0.58 0.63366337]
|
|
|
|
mean value: 0.5845094406136836
|
|
|
|
key: train_fscore
|
|
value: [0.62309368 0.6320442 0.62173913 0.62788145 0.62106406 0.62121212
|
|
0.62188516 0.62513661 0.62377317 0.61704423]
|
|
|
|
mean value: 0.6234873813424298
|
|
|
|
key: test_precision
|
|
value: [0.43283582 0.41791045 0.42253521 0.41176471 0.43939394 0.42253521
|
|
0.42028986 0.43076923 0.42647059 0.46376812]
|
|
|
|
mean value: 0.42882731264872376
|
|
|
|
key: train_precision
|
|
value: [0.45253165 0.46203554 0.4511041 0.4576 0.4503937 0.45054945
|
|
0.45125786 0.45468998 0.45324881 0.44617785]
|
|
|
|
mean value: 0.4529588943309634
|
|
|
|
key: test_recall
|
|
value: [0.90625 0.875 0.9375 0.875 0.90625 0.96774194
|
|
0.93548387 0.875 0.90625 1. ]
|
|
|
|
mean value: 0.9184475806451613
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.52130682 0.49431818 0.50284091 0.48295455 0.53267045 0.51796188
|
|
0.51319648 0.50726744 0.49963663 0.56976744]
|
|
|
|
mean value: 0.5141920778490078
|
|
|
|
key: train_roc_auc
|
|
value: [0.55979644 0.57633588 0.55725191 0.56870229 0.55597964 0.55470738
|
|
0.55597964 0.56472081 0.56218274 0.54949239]
|
|
|
|
mean value: 0.5605149119747872
|
|
|
|
key: test_jcc
|
|
value: [0.41428571 0.3943662 0.4109589 0.38888889 0.42028986 0.41666667
|
|
0.4084507 0.4057971 0.4084507 0.46376812]
|
|
|
|
mean value: 0.41319228520484297
|
|
|
|
key: train_jcc
|
|
value: [0.45253165 0.46203554 0.4511041 0.4576 0.4503937 0.45054945
|
|
0.45125786 0.45468998 0.45324881 0.44617785]
|
|
|
|
mean value: 0.4529588943309634
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02691889 0.03458738 0.03361678 0.04719758 0.03613091 0.04117703
|
|
0.0362668 0.04120827 0.04111934 0.03775907]
|
|
|
|
mean value: 0.03759820461273193
|
|
|
|
key: score_time
|
|
value: [0.02929235 0.02923179 0.02925587 0.03111696 0.01960897 0.0196991
|
|
0.01907682 0.01908278 0.01912689 0.01908231]
|
|
|
|
mean value: 0.02345738410949707
|
|
|
|
key: test_mcc
|
|
value: [0.57868822 0.38833971 0.46022727 0.46022727 0.59365605 0.5716838
|
|
0.61965619 0.59080018 0.59800506 0.61631563]
|
|
|
|
mean value: 0.5477599382068561
|
|
|
|
key: train_mcc
|
|
value: [0.70478986 0.71257893 0.7093521 0.71453678 0.68855355 0.70348941
|
|
0.69287079 0.73607702 0.69767103 0.72858931]
|
|
|
|
mean value: 0.7088508784572793
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.69736842 0.73684211 0.73684211 0.80263158 0.78666667
|
|
0.81333333 0.78666667 0.8 0.81333333]
|
|
|
|
mean value: 0.7763157894736842
|
|
|
|
key: train_accuracy
|
|
value: [0.85419735 0.85861561 0.85714286 0.86008837 0.84683358 0.85441176
|
|
0.84852941 0.87058824 0.85147059 0.86617647]
|
|
|
|
mean value: 0.8568054232002079
|
|
|
|
key: test_fscore
|
|
value: [0.76470588 0.65671642 0.6875 0.6875 0.76190476 0.75757576
|
|
0.78125 0.77777778 0.7761194 0.77419355]
|
|
|
|
mean value: 0.7425243548893857
|
|
|
|
key: train_fscore
|
|
value: [0.83248731 0.83617747 0.83418803 0.83648881 0.8225256 0.83076923
|
|
0.82571912 0.84879725 0.82735043 0.84550085]
|
|
|
|
mean value: 0.8340004105908049
|
|
|
|
key: test_precision
|
|
value: [0.72222222 0.62857143 0.6875 0.6875 0.77419355 0.71428571
|
|
0.75757576 0.7 0.74285714 0.8 ]
|
|
|
|
mean value: 0.7214705813899362
|
|
|
|
key: train_precision
|
|
value: [0.80655738 0.81666667 0.81605351 0.82372881 0.80333333 0.81543624
|
|
0.80263158 0.83445946 0.80936455 0.82178218]
|
|
|
|
mean value: 0.8150013709044559
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.6875 0.6875 0.6875 0.75 0.80645161
|
|
0.80645161 0.875 0.8125 0.75 ]
|
|
|
|
mean value: 0.7675403225806452
|
|
|
|
key: train_recall
|
|
value: [0.86013986 0.85664336 0.85314685 0.84965035 0.84265734 0.8466899
|
|
0.85017422 0.86363636 0.84615385 0.87062937]
|
|
|
|
mean value: 0.8539521454155601
|
|
|
|
key: test_roc_auc
|
|
value: [0.79261364 0.69602273 0.73011364 0.73011364 0.79545455 0.78958944
|
|
0.81231672 0.79796512 0.80159884 0.80523256]
|
|
|
|
mean value: 0.775102085180386
|
|
|
|
key: train_roc_auc
|
|
value: [0.85500632 0.85834712 0.85659887 0.85866741 0.84626506 0.85337039
|
|
0.84875123 0.86963544 0.8507419 0.86678677]
|
|
|
|
mean value: 0.8564170512536526
|
|
|
|
key: test_jcc
|
|
value: [0.61904762 0.48888889 0.52380952 0.52380952 0.61538462 0.6097561
|
|
0.64102564 0.63636364 0.63414634 0.63157895]
|
|
|
|
mean value: 0.592381083472226
|
|
|
|
key: train_jcc
|
|
value: [0.71304348 0.71847507 0.71554252 0.71893491 0.69855072 0.71052632
|
|
0.70317003 0.73731343 0.70553936 0.73235294]
|
|
|
|
mean value: 0.7153448786669865
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.3526473 0.30884123 0.30870199 0.30832839 0.31096816 0.32195187
|
|
0.43453765 0.25615191 0.27564645 0.28402328]
|
|
|
|
mean value: 0.31617982387542726
|
|
|
|
key: score_time
|
|
value: [0.01904249 0.01905918 0.01906228 0.01901865 0.02059817 0.01903939
|
|
0.02135944 0.02266073 0.01898694 0.01897216]
|
|
|
|
mean value: 0.01977994441986084
|
|
|
|
key: test_mcc
|
|
value: [0.57868822 0.38833971 0.46022727 0.48519965 0.54874089 0.5716838
|
|
0.61965619 0.56896508 0.59800506 0.61631563]
|
|
|
|
mean value: 0.5435821511452588
|
|
|
|
key: train_mcc
|
|
value: [0.70478986 0.71257893 0.7093521 0.7393944 0.75183558 0.70348941
|
|
0.69287079 0.7581098 0.69767103 0.72858931]
|
|
|
|
mean value: 0.7198681212311817
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.69736842 0.73684211 0.75 0.77631579 0.78666667
|
|
0.81333333 0.77333333 0.8 0.81333333]
|
|
|
|
mean value: 0.7736666666666666
|
|
|
|
key: train_accuracy
|
|
value: [0.85419735 0.85861561 0.85714286 0.8718704 0.87776141 0.85441176
|
|
0.84852941 0.88088235 0.85147059 0.86617647]
|
|
|
|
mean value: 0.8621058217101273
|
|
|
|
key: test_fscore
|
|
value: [0.76470588 0.65671642 0.6875 0.6984127 0.74626866 0.75757576
|
|
0.78125 0.76712329 0.7761194 0.77419355]
|
|
|
|
mean value: 0.7409865652011667
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:115: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:118: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(

key: train_fscore
value: [0.83248731 0.83617747 0.83418803 0.85128205 0.85860307 0.83076923
0.82571912 0.86201022 0.82735043 0.84550085]
|
|
|
|
mean value: 0.8404087784573542
|
|
|
|
key: test_precision
|
|
value: [0.72222222 0.62857143 0.6875 0.70967742 0.71428571 0.71428571
|
|
0.75757576 0.68292683 0.74285714 0.8 ]
|
|
|
|
mean value: 0.7159902228421111
|
|
|
|
key: train_precision
|
|
value: [0.80655738 0.81666667 0.81605351 0.83277592 0.8372093 0.81543624
|
|
0.80263158 0.84053156 0.80936455 0.82178218]
|
|
|
|
mean value: 0.8199008886212261
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.6875 0.6875 0.6875 0.78125 0.80645161
|
|
0.80645161 0.875 0.8125 0.75 ]
|
|
|
|
mean value: 0.7706653225806451
|
|
|
|
key: train_recall
|
|
value: [0.86013986 0.85664336 0.85314685 0.87062937 0.88111888 0.8466899
|
|
0.85017422 0.88461538 0.84615385 0.87062937]
|
|
|
|
mean value: 0.861994103457518
|
|
|
|
key: test_roc_auc
|
|
value: [0.79261364 0.69602273 0.73011364 0.74147727 0.77698864 0.78958944
|
|
0.81231672 0.78633721 0.80159884 0.80523256]
|
|
|
|
mean value: 0.7732290672099843
|
|
|
|
key: train_roc_auc
|
|
value: [0.85500632 0.85834712 0.85659887 0.87170145 0.87821847 0.85337039
|
|
0.84875123 0.88139399 0.8507419 0.86678677]
|
|
|
|
mean value: 0.8620916513851831
|
|
|
|
key: test_jcc
|
|
value: [0.61904762 0.48888889 0.52380952 0.53658537 0.5952381 0.6097561
|
|
0.64102564 0.62222222 0.63414634 0.63157895]
|
|
|
|
mean value: 0.590229874247846
|
|
|
|
key: train_jcc
|
|
value: [0.71304348 0.71847507 0.71554252 0.74107143 0.75223881 0.71052632
|
|
0.70317003 0.75748503 0.70553936 0.73235294]
|
|
|
|
mean value: 0.7249444982435457
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03952503 0.04458117 0.04192305 0.04709601 0.0482595 0.04177332
|
|
0.04183102 0.04927802 0.04940033 0.07689357]
|
|
|
|
mean value: 0.04805610179901123
|
|
|
|
key: score_time
|
|
value: [0.01453161 0.01288486 0.01447868 0.01298976 0.01239872 0.01299834
|
|
0.01297569 0.0130744 0.01301527 0.0146513 ]
|
|
|
|
mean value: 0.013399863243103027
|
|
|
|
key: test_mcc
|
|
value: [0.71976336 0.54545455 0.43192975 0.57188626 0.7472238 0.63521
|
|
0.60331932 0.57461562 0.63213531 0.67866682]
|
|
|
|
mean value: 0.6140204775170993
|
|
|
|
key: train_mcc
|
|
value: [0.71710512 0.68945784 0.70905448 0.70931293 0.66941284 0.68501726
|
|
0.68613192 0.68932352 0.69416521 0.69122274]
|
|
|
|
mean value: 0.6940203873531311
|
|
|
|
key: test_accuracy
|
|
value: [0.85227273 0.77272727 0.71590909 0.78409091 0.87356322 0.81609195
|
|
0.79310345 0.7816092 0.81609195 0.83908046]
|
|
|
|
mean value: 0.8044540229885058
|
|
|
|
key: train_accuracy
|
|
value: [0.85750636 0.84351145 0.85368957 0.85368957 0.83354511 0.841169
|
|
0.84243964 0.84371029 0.84625159 0.84371029]
|
|
|
|
mean value: 0.8459222867784708
|
|
|
|
key: test_fscore
|
|
value: [0.86597938 0.77272727 0.71910112 0.79569892 0.87058824 0.82222222
|
|
0.8125 0.80412371 0.81818182 0.84444444]
|
|
|
|
mean value: 0.8125567133980068
|
|
|
|
key: train_fscore
|
|
value: [0.8627451 0.84981685 0.85854859 0.85889571 0.84043849 0.84811665
|
|
0.84729064 0.84907975 0.85116851 0.85126965]
|
|
|
|
mean value: 0.8517369930941097
|
|
|
|
key: test_precision
|
|
value: [0.79245283 0.77272727 0.71111111 0.75510204 0.88095238 0.78723404
|
|
0.73584906 0.73584906 0.81818182 0.82608696]
|
|
|
|
mean value: 0.7815546566260066
|
|
|
|
key: train_precision
|
|
value: [0.8321513 0.81690141 0.83095238 0.82938389 0.80796253 0.81351981
|
|
0.82296651 0.81990521 0.82380952 0.81105991]
|
|
|
|
mean value: 0.8208612470780035
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.77272727 0.72727273 0.84090909 0.86046512 0.86046512
|
|
0.90697674 0.88636364 0.81818182 0.86363636]
|
|
|
|
mean value: 0.849154334038055
|
|
|
|
key: train_recall
|
|
value: [0.8956743 0.88549618 0.88804071 0.89058524 0.87563452 0.8857868
|
|
0.87309645 0.88040712 0.88040712 0.8956743 ]
|
|
|
|
mean value: 0.8850802753774816
|
|
|
|
key: test_roc_auc
|
|
value: [0.85227273 0.77272727 0.71590909 0.78409091 0.87341438 0.81659619
|
|
0.79439746 0.78039112 0.81606765 0.83879493]
|
|
|
|
mean value: 0.8044661733615223
|
|
|
|
key: train_roc_auc
|
|
value: [0.85750636 0.84351145 0.85368957 0.85368957 0.83349156 0.84111223
|
|
0.84240064 0.84375686 0.84629493 0.84377624]
|
|
|
|
mean value: 0.8459229408041746
|
|
|
|
key: test_jcc
|
|
value: [0.76363636 0.62962963 0.56140351 0.66071429 0.77083333 0.69811321
|
|
0.68421053 0.67241379 0.69230769 0.73076923]
|
|
|
|
mean value: 0.6864031571128872
|
|
|
|
key: train_jcc
|
|
value: [0.75862069 0.7388535 0.75215517 0.75268817 0.72478992 0.73628692
|
|
0.73504274 0.73773987 0.74089936 0.74105263]
|
|
|
|
mean value: 0.7418128969385925
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.98496866 0.95473599 1.09272647 0.97494578 1.13974237 1.4481535
|
|
1.49764705 1.4215436 1.07872605 1.23035407]
|
|
|
|
mean value: 1.1823543548583983
|
|
|
|
key: score_time
|
|
value: [0.01475596 0.01501179 0.0150342 0.01500845 0.0150547 0.01534939
|
|
0.01498508 0.01506901 0.01850271 0.01581573]
|
|
|
|
mean value: 0.01545870304107666
|
|
|
|
key: test_mcc
|
|
value: [0.66759342 0.66062825 0.50051733 0.5933661 0.77102073 0.65696218
|
|
0.81702814 0.61269937 0.60940803 0.63444041]
|
|
|
|
mean value: 0.6523663980572162
|
|
|
|
key: train_mcc
|
|
value: [0.79768851 0.81031799 0.79792694 0.81520262 0.81278376 0.81320255
|
|
0.78072794 0.82034342 0.82365593 0.81322894]
|
|
|
|
mean value: 0.8085078601766735
|
|
|
|
key: test_accuracy
|
|
value: [0.82954545 0.82954545 0.75 0.79545455 0.88505747 0.82758621
|
|
0.90804598 0.8045977 0.8045977 0.81609195]
|
|
|
|
mean value: 0.8250522466039707
|
|
|
|
key: train_accuracy
|
|
value: [0.89821883 0.90458015 0.89821883 0.90712468 0.90597205 0.90597205
|
|
0.88945362 0.90978399 0.91105464 0.90597205]
|
|
|
|
mean value: 0.9036350879915678
|
|
|
|
key: test_fscore
|
|
value: [0.84210526 0.83516484 0.74418605 0.80434783 0.88636364 0.83146067
|
|
0.90909091 0.8172043 0.8045977 0.82608696]
|
|
|
|
mean value: 0.8300608149279597
|
|
|
|
key: train_fscore
|
|
value: [0.9009901 0.9070632 0.90123457 0.90931677 0.90818859 0.90864198
|
|
0.89325153 0.91158157 0.91358025 0.90841584]
|
|
|
|
mean value: 0.9062264386395962
|
|
|
|
key: test_precision
|
|
value: [0.78431373 0.80851064 0.76190476 0.77083333 0.86666667 0.80434783
|
|
0.88888889 0.7755102 0.81395349 0.79166667]
|
|
|
|
mean value: 0.8066596199789068
|
|
|
|
key: train_precision
|
|
value: [0.87710843 0.88405797 0.87529976 0.88834951 0.88834951 0.88461538
|
|
0.86460808 0.89268293 0.88729017 0.88433735]
|
|
|
|
mean value: 0.8826699098784945
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.86363636 0.72727273 0.84090909 0.90697674 0.86046512
|
|
0.93023256 0.86363636 0.79545455 0.86363636]
|
|
|
|
mean value: 0.8561310782241015
|
|
|
|
key: train_recall
|
|
value: [0.92620865 0.93129771 0.92875318 0.93129771 0.92893401 0.93401015
|
|
0.92385787 0.93129771 0.94147583 0.93384224]
|
|
|
|
mean value: 0.9310975058446674
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.82954545 0.75 0.79545455 0.88530655 0.82795983
|
|
0.9082981 0.80391121 0.80470402 0.81553911]
|
|
|
|
mean value: 0.8250264270613108
|
|
|
|
key: train_roc_auc
|
|
value: [0.89821883 0.90458015 0.89821883 0.90712468 0.90594283 0.90593637
|
|
0.88940985 0.90981129 0.91109324 0.90600741]
|
|
|
|
mean value: 0.903634349853399
|
|
|
|
key: test_jcc
|
|
value: [0.72727273 0.71698113 0.59259259 0.67272727 0.79591837 0.71153846
|
|
0.83333333 0.69090909 0.67307692 0.7037037 ]
|
|
|
|
mean value: 0.7118053604576515
|
|
|
|
key: train_jcc
|
|
value: [0.81981982 0.82993197 0.82022472 0.83371298 0.83181818 0.83257919
|
|
0.80709534 0.8375286 0.84090909 0.83219955]
|
|
|
|
mean value: 0.8285819448297327
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01602221 0.01385975 0.01371217 0.0158987 0.01669502 0.01303077
|
|
0.01360488 0.01454973 0.01411724 0.01296067]
|
|
|
|
mean value: 0.014445114135742187
|
|
|
|
key: score_time
|
|
value: [0.01249313 0.01088738 0.01216459 0.0105288 0.00993156 0.01178885
|
|
0.01053643 0.01033878 0.01024318 0.01122117]
|
|
|
|
mean value: 0.011013388633728027
|
|
|
|
key: test_mcc
|
|
value: [0.28767798 0.40951418 0.29926602 0.56832862 0.60920157 0.51879367
|
|
0.51879367 0.24188306 0.42547569 0.35627361]
|
|
|
|
mean value: 0.4235208045282095
|
|
|
|
key: train_mcc
|
|
value: [0.41486656 0.49201246 0.46947331 0.46683522 0.46468129 0.45990325
|
|
0.47404657 0.47833815 0.46483462 0.50586826]
|
|
|
|
mean value: 0.46908596855294155
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.70454545 0.64772727 0.78409091 0.8045977 0.75862069
|
|
0.75862069 0.62068966 0.71264368 0.67816092]
|
|
|
|
mean value: 0.7106060606060606
|
|
|
|
key: train_accuracy
|
|
value: [0.69974555 0.74554707 0.73409669 0.73282443 0.73189327 0.72935197
|
|
0.73570521 0.73824651 0.73189327 0.75222363]
|
|
|
|
mean value: 0.7331527590521547
|
|
|
|
key: test_fscore
|
|
value: [0.68627451 0.71111111 0.67368421 0.78651685 0.8 0.76404494
|
|
0.76404494 0.64516129 0.71264368 0.68888889]
|
|
|
|
mean value: 0.7232370430386771
|
|
|
|
key: train_fscore
|
|
value: [0.73542601 0.75308642 0.74355828 0.74201474 0.7404674 0.73929009
|
|
0.74939759 0.74878049 0.72200264 0.7607362 ]
|
|
|
|
mean value: 0.7434759852829844
|
|
|
|
key: test_precision
|
|
value: [0.60344828 0.69565217 0.62745098 0.77777778 0.80952381 0.73913043
|
|
0.73913043 0.6122449 0.72093023 0.67391304]
|
|
|
|
mean value: 0.6999202061029658
|
|
|
|
key: train_precision
|
|
value: [0.65731463 0.73141487 0.71800948 0.71733967 0.71837709 0.71394799
|
|
0.71330275 0.71896956 0.74863388 0.73459716]
|
|
|
|
mean value: 0.7171907065852907
|
|
|
|
key: test_recall
|
|
value: [0.79545455 0.72727273 0.72727273 0.79545455 0.79069767 0.79069767
|
|
0.79069767 0.68181818 0.70454545 0.70454545]
|
|
|
|
mean value: 0.750845665961945
|
|
|
|
key: train_recall
|
|
value: [0.8346056 0.77608142 0.77099237 0.76844784 0.76395939 0.76649746
|
|
0.7893401 0.78117048 0.69720102 0.78880407]
|
|
|
|
mean value: 0.7737099753296909
|
|
|
|
key: test_roc_auc
|
|
value: [0.63636364 0.70454545 0.64772727 0.78409091 0.80443975 0.7589852
|
|
0.7589852 0.61997886 0.71273784 0.67785412]
|
|
|
|
mean value: 0.7105708245243129
|
|
|
|
key: train_roc_auc
|
|
value: [0.69974555 0.74554707 0.73409669 0.73282443 0.73185247 0.72930471
|
|
0.73563697 0.73830098 0.73184924 0.75227006]
|
|
|
|
mean value: 0.7331428165484817
|
|
|
|
key: test_jcc
|
|
value: [0.52238806 0.55172414 0.50793651 0.64814815 0.66666667 0.61818182
|
|
0.61818182 0.47619048 0.55357143 0.52542373]
|
|
|
|
mean value: 0.568841279032295
|
|
|
|
key: train_jcc
|
|
value: [0.58156028 0.6039604 0.59179688 0.58984375 0.58789062 0.58640777
|
|
0.59922929 0.59844055 0.56494845 0.61386139]
|
|
|
|
mean value: 0.5917939369364226
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01541185 0.01635194 0.01640701 0.01618457 0.01612496 0.01588988
|
|
0.02039766 0.01608372 0.01671124 0.02363396]
|
|
|
|
mean value: 0.017319679260253906
|
|
|
|
key: score_time
|
|
value: [0.01239347 0.01237416 0.01222992 0.0126605 0.01228976 0.01222897
|
|
0.01241994 0.01231122 0.01263213 0.01260352]
|
|
|
|
mean value: 0.012414360046386718
|
|
|
|
key: test_mcc
|
|
value: [0.68252363 0.36514837 0.34530694 0.52613536 0.61090601 0.3853797
|
|
0.52126134 0.40330006 0.54295079 0.24714945]
|
|
|
|
mean value: 0.46300616567387165
|
|
|
|
key: train_mcc
|
|
value: [0.47428882 0.51540005 0.50501003 0.4893689 0.47802164 0.49540494
|
|
0.48842804 0.50058709 0.49168322 0.49679032]
|
|
|
|
mean value: 0.4934983053216603
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.68181818 0.67045455 0.76136364 0.8045977 0.68965517
|
|
0.75862069 0.70114943 0.77011494 0.62068966]
|
|
|
|
mean value: 0.729937304075235
|
|
|
|
key: train_accuracy
|
|
value: [0.73536896 0.75699746 0.7519084 0.74300254 0.73824651 0.74714104
|
|
0.7433291 0.74968234 0.74459975 0.74714104]
|
|
|
|
mean value: 0.7457417124972922
|
|
|
|
key: test_fscore
|
|
value: [0.84444444 0.69565217 0.69473684 0.77419355 0.80898876 0.70967742
|
|
0.76923077 0.7173913 0.76190476 0.66666667]
|
|
|
|
mean value: 0.7442886694399654
|
|
|
|
key: train_fscore
|
|
value: [0.75059952 0.76564417 0.7601476 0.75721154 0.74878049 0.75582822
|
|
0.75425791 0.75768758 0.75636364 0.75878788]
|
|
|
|
mean value: 0.7565308540334024
|
|
|
|
key: test_precision
|
|
value: [0.82608696 0.66666667 0.64705882 0.73469388 0.7826087 0.66
|
|
0.72916667 0.6875 0.8 0.6 ]
|
|
|
|
mean value: 0.7133781686587679
|
|
|
|
key: train_precision
|
|
value: [0.70975057 0.73933649 0.73571429 0.71753986 0.72065728 0.73159145
|
|
0.72429907 0.73333333 0.72222222 0.72453704]
|
|
|
|
mean value: 0.725898159276402
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.72727273 0.75 0.81818182 0.8372093 0.76744186
|
|
0.81395349 0.75 0.72727273 0.75 ]
|
|
|
|
mean value: 0.7804968287526427
|
|
|
|
key: train_recall
|
|
value: [0.79643766 0.79389313 0.78625954 0.80152672 0.77918782 0.78172589
|
|
0.78680203 0.78371501 0.79389313 0.79643766]
|
|
|
|
mean value: 0.7899878585913382
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.68181818 0.67045455 0.76136364 0.80496829 0.69053911
|
|
0.75924947 0.7005814 0.77061311 0.61918605]
|
|
|
|
mean value: 0.7299682875264271
|
|
|
|
key: train_roc_auc
|
|
value: [0.73536896 0.75699746 0.7519084 0.74300254 0.73819442 0.74709704
|
|
0.74327379 0.74972553 0.7446623 0.7472036 ]
|
|
|
|
mean value: 0.7457434029526873
|
|
|
|
key: test_jcc
|
|
value: [0.73076923 0.53333333 0.53225806 0.63157895 0.67924528 0.55
|
|
0.625 0.55932203 0.61538462 0.5 ]
|
|
|
|
mean value: 0.5956891508288903
|
|
|
|
key: train_jcc
|
|
value: [0.60076775 0.62027833 0.61309524 0.60928433 0.59844055 0.60749507
|
|
0.60546875 0.60990099 0.60818713 0.61132812]
|
|
|
|
mean value: 0.6084246269566757
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01486754 0.01294136 0.01227379 0.01070833 0.01139832 0.01107717
|
|
0.01244855 0.01302791 0.01268315 0.01264668]
|
|
|
|
mean value: 0.012407279014587403
|
|
|
|
key: score_time
|
|
value: [0.04071927 0.02211118 0.01623821 0.01385784 0.01441145 0.01481605
|
|
0.01512694 0.01559448 0.01551914 0.01509857]
|
|
|
|
mean value: 0.018349313735961915
|
|
|
|
key: test_mcc
|
|
value: [0.54772256 0.54601891 0.38646346 0.54601891 0.56748941 0.58327727
|
|
0.35695404 0.42729122 0.35695404 0.42729122]
|
|
|
|
mean value: 0.47454810311174817
|
|
|
|
key: train_mcc
|
|
value: [0.63569864 0.64349398 0.60969189 0.61419466 0.63993928 0.62666052
|
|
0.65652237 0.62817177 0.65053867 0.63436197]
|
|
|
|
mean value: 0.633927376906168
|
|
|
|
key: test_accuracy
|
|
value: [0.77272727 0.77272727 0.69318182 0.77272727 0.7816092 0.7816092
|
|
0.67816092 0.71264368 0.67816092 0.71264368]
|
|
|
|
mean value: 0.7356191222570533
|
|
|
|
key: train_accuracy
|
|
value: [0.81679389 0.82061069 0.80407125 0.80661578 0.81956798 0.81321474
|
|
0.82719187 0.81321474 0.82465057 0.81702668]
|
|
|
|
mean value: 0.8162958185010233
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.77777778 0.69662921 0.77777778 0.79120879 0.80412371
|
|
0.68181818 0.7311828 0.6744186 0.7311828 ]
|
|
|
|
mean value: 0.7448728345107067
|
|
|
|
key: train_fscore
|
|
value: [0.82396088 0.82783883 0.81081081 0.81188119 0.82425743 0.81602003
|
|
0.83414634 0.8196319 0.82962963 0.81954887]
|
|
|
|
mean value: 0.8217725902851899
|
|
|
|
key: test_precision
|
|
value: [0.75 0.76086957 0.68888889 0.76086957 0.75 0.72222222
|
|
0.66666667 0.69387755 0.69047619 0.69387755]
|
|
|
|
mean value: 0.7177748200729567
|
|
|
|
key: train_precision
|
|
value: [0.79294118 0.79577465 0.78384798 0.79036145 0.80434783 0.80493827
|
|
0.8028169 0.79146919 0.8057554 0.80740741]
|
|
|
|
mean value: 0.7979660247642671
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.79545455 0.70454545 0.79545455 0.8372093 0.90697674
|
|
0.69767442 0.77272727 0.65909091 0.77272727]
|
|
|
|
mean value: 0.7760042283298098
|
|
|
|
key: train_recall
|
|
value: [0.85750636 0.86259542 0.83969466 0.8346056 0.84517766 0.82741117
|
|
0.8680203 0.84987277 0.85496183 0.83206107]
|
|
|
|
mean value: 0.8471906846979502
|
|
|
|
key: test_roc_auc
|
|
value: [0.77272727 0.77272727 0.69318182 0.77272727 0.78224101 0.78303383
|
|
0.67838266 0.71194503 0.67838266 0.71194503]
|
|
|
|
mean value: 0.7357293868921776
|
|
|
|
key: train_roc_auc
|
|
value: [0.81679389 0.82061069 0.80407125 0.80661578 0.8195354 0.81319668
|
|
0.82713992 0.81326126 0.82468904 0.81704576]
|
|
|
|
mean value: 0.816295966210718
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.63636364 0.53448276 0.63636364 0.65454545 0.67241379
|
|
0.51724138 0.57627119 0.50877193 0.57627119]
|
|
|
|
mean value: 0.595558210387027
|
|
|
|
key: train_jcc
|
|
value: [0.7006237 0.70625 0.68181818 0.68333333 0.70105263 0.68921776
|
|
0.71548117 0.69438669 0.70886076 0.69426752]
|
|
|
|
mean value: 0.6975291747691413
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04696131 0.05472875 0.05334735 0.04616427 0.05575418 0.04985499
|
|
0.05442619 0.05384159 0.04604459 0.05362082]
|
|
|
|
mean value: 0.05147440433502197
|
|
|
|
key: score_time
|
|
value: [0.01842785 0.01855063 0.01799917 0.01824617 0.01761675 0.01863289
|
|
0.01906061 0.01764941 0.01791215 0.0178628 ]
|
|
|
|
mean value: 0.018195843696594237
|
|
|
|
key: test_mcc
|
|
value: [0.67124862 0.48038446 0.52286233 0.5547002 0.70137421 0.58908039
|
|
0.5504913 0.52749822 0.60940803 0.50171077]
|
|
|
|
mean value: 0.5708758522241896
|
|
|
|
key: train_mcc
|
|
value: [0.67561944 0.69521698 0.67661176 0.68605278 0.66161012 0.66633066
|
|
0.66558603 0.66092901 0.66447916 0.65813246]
|
|
|
|
mean value: 0.6710568396829444
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.73863636 0.76136364 0.77272727 0.85057471 0.79310345
|
|
0.77011494 0.75862069 0.8045977 0.74712644]
|
|
|
|
mean value: 0.7815047021943573
|
|
|
|
key: train_accuracy
|
|
value: [0.8346056 0.84478372 0.83587786 0.84096692 0.82846252 0.83100381
|
|
0.83100381 0.82846252 0.82973316 0.82465057]
|
|
|
|
mean value: 0.8329550488051706
|
|
|
|
key: test_fscore
|
|
value: [0.84313725 0.75268817 0.76404494 0.79166667 0.85057471 0.8
|
|
0.78723404 0.78350515 0.8045977 0.77083333]
|
|
|
|
mean value: 0.7948281981750667
|
|
|
|
key: train_fscore
|
|
value: [0.8452381 0.85406699 0.84513806 0.84921592 0.83832335 0.84033613
|
|
0.83956574 0.8371532 0.83932854 0.83764706]
|
|
|
|
mean value: 0.8426013081125754
|
|
|
|
key: test_precision
|
|
value: [0.74137931 0.71428571 0.75555556 0.73076923 0.84090909 0.76595745
|
|
0.7254902 0.71698113 0.81395349 0.71153846]
|
|
|
|
mean value: 0.7516819626737388
|
|
|
|
key: train_precision
|
|
value: [0.79418345 0.80586907 0.8 0.80733945 0.79365079 0.79726651
|
|
0.8 0.79587156 0.79365079 0.77899344]
|
|
|
|
mean value: 0.7966825066413111
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.79545455 0.77272727 0.86363636 0.86046512 0.8372093
|
|
0.86046512 0.86363636 0.79545455 0.84090909]
|
|
|
|
mean value: 0.846723044397463
|
|
|
|
key: train_recall
|
|
value: [0.90330789 0.90839695 0.8956743 0.8956743 0.88832487 0.88832487
|
|
0.88324873 0.88295165 0.89058524 0.90585242]
|
|
|
|
mean value: 0.8942341225248963
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.73863636 0.76136364 0.77272727 0.8506871 0.79360465
|
|
0.77114165 0.75739958 0.80470402 0.74603594]
|
|
|
|
mean value: 0.7814482029598309
|
|
|
|
key: train_roc_auc
|
|
value: [0.8346056 0.84478372 0.83587786 0.84096692 0.82838636 0.83093088
|
|
0.83093734 0.82853166 0.82981039 0.82475362]
|
|
|
|
mean value: 0.8329584350499218
|
|
|
|
key: test_jcc
|
|
value: [0.72881356 0.60344828 0.61818182 0.65517241 0.74 0.66666667
|
|
0.64912281 0.6440678 0.67307692 0.62711864]
|
|
|
|
mean value: 0.6605668904598124
|
|
|
|
key: train_jcc
|
|
value: [0.73195876 0.74530271 0.73180873 0.73794549 0.72164948 0.72463768
|
|
0.72349272 0.71991701 0.7231405 0.72064777]
|
|
|
|
mean value: 0.7280500872128758
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.06535935 2.9703083 3.16897154 1.63516951 3.05120897 2.99582481
|
|
3.054533 3.10651636 3.26628566 2.91335678]
|
|
|
|
mean value: 2.822753429412842
|
|
|
|
key: score_time
|
|
value: [0.01267791 0.01468849 0.01453876 0.0129478 0.014781 0.01487565
|
|
0.01514196 0.01463127 0.01515675 0.01527548]
|
|
|
|
mean value: 0.01447150707244873
|
|
|
|
key: test_mcc
|
|
value: [0.68070616 0.5 0.43463356 0.52394654 0.70137421 0.56484984
|
|
0.51744186 0.47145877 0.54198427 0.58821234]
|
|
|
|
mean value: 0.5524607560393185
|
|
|
|
key: train_mcc
|
|
value: [0.87500374 0.95173097 0.9417228 0.87295442 0.93960096 0.94162706
|
|
0.93647031 0.97972048 0.91900117 0.93920448]
|
|
|
|
mean value: 0.9297036393442298
|
|
|
|
key: test_accuracy
|
|
value: [0.82954545 0.75 0.71590909 0.76136364 0.85057471 0.7816092
|
|
0.75862069 0.73563218 0.77011494 0.79310345]
|
|
|
|
mean value: 0.7746473354231975
|
|
|
|
key: train_accuracy
|
|
value: [0.9351145 0.97582697 0.97073791 0.93638677 0.96950445 0.9707751
|
|
0.9682338 0.98983482 0.95806861 0.96950445]
|
|
|
|
mean value: 0.9643987377582923
|
|
|
|
key: test_fscore
|
|
value: [0.84848485 0.75 0.69879518 0.75294118 0.85057471 0.78651685
|
|
0.75862069 0.73563218 0.7826087 0.80434783]
|
|
|
|
mean value: 0.7768522167556939
|
|
|
|
key: train_fscore
|
|
value: [0.93833132 0.97567222 0.97039897 0.93573265 0.97007481 0.9706258
|
|
0.96831432 0.98987342 0.95960832 0.9697733 ]
|
|
|
|
mean value: 0.9648405125048765
|
|
|
|
key: test_precision
|
|
value: [0.76363636 0.75 0.74358974 0.7804878 0.84090909 0.76086957
|
|
0.75 0.74418605 0.75 0.77083333]
|
|
|
|
mean value: 0.76545119480756
|
|
|
|
key: train_precision
|
|
value: [0.89400922 0.98195876 0.98177083 0.94545455 0.95343137 0.97686375
|
|
0.96708861 0.98488665 0.9245283 0.96009975]
|
|
|
|
mean value: 0.9570091794005952
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.75 0.65909091 0.72727273 0.86046512 0.81395349
|
|
0.76744186 0.72727273 0.81818182 0.84090909]
|
|
|
|
mean value: 0.7919133192389006
|
|
|
|
key: train_recall
|
|
value: [0.98727735 0.96946565 0.95928753 0.92620865 0.98730964 0.96446701
|
|
0.96954315 0.99491094 0.99745547 0.97964377]
|
|
|
|
mean value: 0.9735569160822
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.75 0.71590909 0.76136364 0.8506871 0.78197674
|
|
0.75872093 0.73572939 0.76955603 0.79254757]
|
|
|
|
mean value: 0.7746035940803383
|
|
|
|
key: train_roc_auc
|
|
value: [0.9351145 0.97582697 0.97073791 0.93638677 0.96948179 0.97078312
|
|
0.96823213 0.98984126 0.9581186 0.96951731]
|
|
|
|
mean value: 0.9644040376641996
|
|
|
|
key: test_jcc
|
|
value: [0.73684211 0.6 0.53703704 0.60377358 0.74 0.64814815
|
|
0.61111111 0.58181818 0.64285714 0.67272727]
|
|
|
|
mean value: 0.6374314583867712
|
|
|
|
key: train_jcc
|
|
value: [0.88382688 0.9525 0.9425 0.87922705 0.94188862 0.94292804
|
|
0.93857494 0.97994987 0.92235294 0.94132029]
|
|
|
|
mean value: 0.9325068639804781
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06067634 0.05294609 0.04533696 0.05182934 0.04592848 0.05230975
|
|
0.04506969 0.04694581 0.05104733 0.05060101]
|
|
|
|
mean value: 0.05026907920837402
|
|
|
|
key: score_time
|
|
value: [0.00971699 0.00921226 0.00920987 0.00925636 0.00938106 0.00934958
|
|
0.00928903 0.00924182 0.00931191 0.00933528]
|
|
|
|
mean value: 0.009330415725708007
|
|
|
|
key: test_mcc
|
|
value: [0.86452993 0.68181818 0.75174939 0.79730996 0.72410148 0.4957562
|
|
0.65520898 0.65641902 0.84118687 0.79323121]
|
|
|
|
mean value: 0.7261311211803974
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.84090909 0.875 0.89772727 0.86206897 0.74712644
|
|
0.82758621 0.82758621 0.91954023 0.89655172]
|
|
|
|
mean value: 0.8625914315569487
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93023256 0.84090909 0.87912088 0.9010989 0.86046512 0.73170732
|
|
0.82352941 0.83516484 0.91764706 0.8988764 ]
|
|
|
|
mean value: 0.8618751572868099
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95238095 0.84090909 0.85106383 0.87234043 0.86046512 0.76923077
|
|
0.83333333 0.80851064 0.95121951 0.88888889]
|
|
|
|
mean value: 0.8628342556834248
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.84090909 0.90909091 0.93181818 0.86046512 0.69767442
|
|
0.81395349 0.86363636 0.88636364 0.90909091]
|
|
|
|
mean value: 0.8622093023255814
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93181818 0.84090909 0.875 0.89772727 0.86205074 0.74656448
|
|
0.82743129 0.82716702 0.919926 0.89640592]
|
|
|
|
mean value: 0.8625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86956522 0.7254902 0.78431373 0.82 0.75510204 0.57692308
|
|
0.7 0.71698113 0.84782609 0.81632653]
|
|
|
|
mean value: 0.7612528006343573
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17999911 0.17724442 0.17448497 0.17866874 0.17901969 0.17455459
|
|
0.1772244 0.1795907 0.18107581 0.17981648]
|
|
|
|
mean value: 0.17816789150238038
|
|
|
|
key: score_time
|
|
value: [0.02069139 0.01987624 0.02035975 0.01991272 0.01992059 0.0199163
|
|
0.02066851 0.019912 0.02007174 0.02016401]
|
|
|
|
mean value: 0.02014932632446289
|
|
|
|
key: test_mcc
|
|
value: [0.73029674 0.56950711 0.54772256 0.63702206 0.63213531 0.58908039
|
|
0.61090601 0.54198427 0.61371748 0.59116498]
|
|
|
|
mean value: 0.6063536913398097
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.86363636 0.78409091 0.77272727 0.81818182 0.81609195 0.79310345
|
|
0.8045977 0.77011494 0.8045977 0.79310345]
|
|
|
|
mean value: 0.8020245559038662
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.79120879 0.76190476 0.82222222 0.81395349 0.8
|
|
0.80898876 0.7826087 0.79518072 0.80851064]
|
|
|
|
mean value: 0.8054143301985729
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.76595745 0.8 0.80434783 0.81395349 0.76595745
|
|
0.7826087 0.75 0.84615385 0.76 ]
|
|
|
|
mean value: 0.7922312083215425
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.81818182 0.72727273 0.84090909 0.81395349 0.8372093
|
|
0.8372093 0.81818182 0.75 0.86363636]
|
|
|
|
mean value: 0.8215644820295983
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.86363636 0.78409091 0.77272727 0.81818182 0.81606765 0.79360465
|
|
0.80496829 0.76955603 0.80523256 0.7922833 ]
|
|
|
|
mean value: 0.8020348837209302
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.65454545 0.61538462 0.69811321 0.68627451 0.66666667
|
|
0.67924528 0.64285714 0.66 0.67857143]
|
|
|
|
mean value: 0.6750889077626037
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01222849 0.01173115 0.01190543 0.01236916 0.01223779 0.01243281
|
|
0.01243258 0.01199484 0.01241732 0.01229382]
|
|
|
|
mean value: 0.012204337120056152
|
|
|
|
key: score_time
|
|
value: [0.00952125 0.00915074 0.00955224 0.00985074 0.00988984 0.00982094
|
|
0.0091567 0.00913739 0.00936317 0.00908303]
|
|
|
|
mean value: 0.009452605247497558
|
|
|
|
key: test_mcc
|
|
value: [0.43192975 0.41294832 0.36363636 0.45454545 0.29237545 0.31434142
|
|
0.33456898 0.14917898 0.33315711 0.31094663]
|
|
|
|
mean value: 0.3397628443637556
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71590909 0.70454545 0.68181818 0.72727273 0.64367816 0.65517241
|
|
0.66666667 0.57471264 0.66666667 0.65517241]
|
|
|
|
mean value: 0.6691614420062696
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71264368 0.72340426 0.68181818 0.72727273 0.66666667 0.67391304
|
|
0.6741573 0.60215054 0.6741573 0.65116279]
|
|
|
|
mean value: 0.6787346487789561
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.72093023 0.68 0.68181818 0.72727273 0.62 0.63265306
|
|
0.65217391 0.57142857 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6619610020678921
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.70454545 0.77272727 0.68181818 0.72727273 0.72093023 0.72093023
|
|
0.69767442 0.63636364 0.68181818 0.63636364]
|
|
|
|
mean value: 0.6980443974630021
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71590909 0.70454545 0.68181818 0.72727273 0.64455603 0.65591966
|
|
0.66701903 0.57399577 0.66649049 0.65539112]
|
|
|
|
mean value: 0.669291754756871
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55357143 0.56666667 0.51724138 0.57142857 0.5 0.50819672
|
|
0.50847458 0.43076923 0.50847458 0.48275862]
|
|
|
|
mean value: 0.5147581771289745
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.55
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.77516341 2.72674942 2.75992584 2.77896261 2.67283797 2.68284583
|
|
2.66170597 2.65351701 2.66825771 2.68729377]
|
|
|
|
mean value: 2.7067259550094604
|
|
|
|
key: score_time
|
|
value: [0.10390854 0.10599661 0.09895778 0.10612893 0.10672951 0.10571718
|
|
0.10673928 0.10393858 0.09965968 0.10147977]
|
|
|
|
mean value: 0.10392558574676514
|
|
|
|
key: test_mcc
|
|
value: [0.86722738 0.79730996 0.72802521 0.75019377 0.72746922 0.81702814
|
|
0.84118687 0.77077916 0.77359882 0.86289151]
|
|
|
|
mean value: 0.7935710044788394
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.89772727 0.86363636 0.875 0.86206897 0.90804598
|
|
0.91954023 0.88505747 0.88505747 0.93103448]
|
|
|
|
mean value: 0.8958986415882968
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93478261 0.9010989 0.86046512 0.87640449 0.86666667 0.90909091
|
|
0.92134831 0.88888889 0.88095238 0.93333333]
|
|
|
|
mean value: 0.8973031613994567
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89583333 0.87234043 0.88095238 0.86666667 0.82978723 0.88888889
|
|
0.89130435 0.86956522 0.925 0.91304348]
|
|
|
|
mean value: 0.8833381972893999
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.93181818 0.84090909 0.88636364 0.90697674 0.93023256
|
|
0.95348837 0.90909091 0.84090909 0.95454545]
|
|
|
|
mean value: 0.9131606765327696
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93181818 0.89772727 0.86363636 0.875 0.86257928 0.9082981
|
|
0.919926 0.88477801 0.88557082 0.9307611 ]
|
|
|
|
mean value: 0.8960095137420718
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.87755102 0.82 0.75510204 0.78 0.76470588 0.83333333
|
|
0.85416667 0.8 0.78723404 0.875 ]
|
|
|
|
mean value: 0.8147092986130622
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.05537844 1.15527582 1.09323168 1.14960814 1.12652636 1.13774252
|
|
1.10354066 1.09963131 1.08902621 1.08362079]
|
|
|
|
mean value: 1.1093581914901733
|
|
|
|
key: score_time
|
|
value: [0.27662992 0.25080609 0.23210335 0.27736449 0.28904748 0.29127479
|
|
0.25754666 0.27043676 0.2904017 0.28125048]
|
|
|
|
mean value: 0.27168617248535154
|
|
|
|
key: test_mcc
|
|
value: [0.88843109 0.75174939 0.70472748 0.79566006 0.74735729 0.83932347
|
|
0.84118687 0.77077916 0.77102073 0.86289151]
|
|
|
|
mean value: 0.7973127061523191
|
|
|
|
key: train_mcc
|
|
value: [0.90915072 0.91129852 0.91627083 0.91638957 0.91910275 0.91119668
|
|
0.92405699 0.9217261 0.91129545 0.91402422]
|
|
|
|
mean value: 0.9154511824913257
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.875 0.85227273 0.89772727 0.87356322 0.91954023
|
|
0.91954023 0.88505747 0.88505747 0.93103448]
|
|
|
|
mean value: 0.8981974921630094
|
|
|
|
key: train_accuracy
|
|
value: [0.95419847 0.95547074 0.95801527 0.95801527 0.95933926 0.95552732
|
|
0.96188056 0.96060991 0.95552732 0.95679797]
|
|
|
|
mean value: 0.957538208353945
|
|
|
|
key: test_fscore
|
|
value: [0.94505495 0.87912088 0.85057471 0.8988764 0.87356322 0.91954023
|
|
0.92134831 0.88888889 0.88372093 0.93333333]
|
|
|
|
mean value: 0.8994021856651269
|
|
|
|
key: train_fscore
|
|
value: [0.95511222 0.95608532 0.95849057 0.95859473 0.96 0.95597484
|
|
0.96240602 0.9612015 0.95597484 0.95739348]
|
|
|
|
mean value: 0.9581233521836118
|
|
|
|
key: test_precision
|
|
value: [0.91489362 0.85106383 0.86046512 0.88888889 0.86363636 0.90909091
|
|
0.89130435 0.86956522 0.9047619 0.91304348]
|
|
|
|
mean value: 0.8866713672943908
|
|
|
|
key: train_precision
|
|
value: [0.93643032 0.94306931 0.94776119 0.94554455 0.94581281 0.94763092
|
|
0.95049505 0.94581281 0.94527363 0.94320988]
|
|
|
|
mean value: 0.945104046961017
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.90909091 0.84090909 0.90909091 0.88372093 0.93023256
|
|
0.95348837 0.90909091 0.86363636 0.95454545]
|
|
|
|
mean value: 0.913107822410148
|
|
|
|
key: train_recall
|
|
value: [0.97455471 0.96946565 0.96946565 0.97201018 0.97461929 0.96446701
|
|
0.97461929 0.97709924 0.96692112 0.97201018]
|
|
|
|
mean value: 0.9715232301313597
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.875 0.85227273 0.89772727 0.87367865 0.91966173
|
|
0.919926 0.88477801 0.88530655 0.9307611 ]
|
|
|
|
mean value: 0.8982293868921776
|
|
|
|
key: train_roc_auc
|
|
value: [0.95419847 0.95547074 0.95801527 0.95801527 0.95931982 0.95551595
|
|
0.96186435 0.96063084 0.95554178 0.95681727]
|
|
|
|
mean value: 0.9575389752134433
|
|
|
|
key: test_jcc
|
|
value: [0.89583333 0.78431373 0.74 0.81632653 0.7755102 0.85106383
|
|
0.85416667 0.8 0.79166667 0.875 ]
|
|
|
|
mean value: 0.8183880956637974
|
|
|
|
key: train_jcc
|
|
value: [0.91408115 0.91586538 0.92028986 0.92048193 0.92307692 0.91566265
|
|
0.92753623 0.9253012 0.91566265 0.91826923]
|
|
|
|
mean value: 0.9196227204737726
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02685237 0.01579261 0.01570487 0.01578689 0.01585865 0.01588416
|
|
0.01590824 0.0158875 0.01584315 0.01586103]
|
|
|
|
mean value: 0.016937947273254393
|
|
|
|
key: score_time
|
|
value: [0.01227856 0.01211953 0.01234627 0.01220894 0.01220393 0.01216817
|
|
0.01225019 0.01219034 0.01222849 0.01226616]
|
|
|
|
mean value: 0.012226057052612305
|
|
|
|
key: test_mcc
|
|
value: [0.68252363 0.36514837 0.34530694 0.52613536 0.61090601 0.3853797
|
|
0.52126134 0.40330006 0.54295079 0.24714945]
|
|
|
|
mean value: 0.46300616567387165
|
|
|
|
key: train_mcc
|
|
value: [0.47428882 0.51540005 0.50501003 0.4893689 0.47802164 0.49540494
|
|
0.48842804 0.50058709 0.49168322 0.49679032]
|
|
|
|
mean value: 0.4934983053216603
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.68181818 0.67045455 0.76136364 0.8045977 0.68965517
|
|
0.75862069 0.70114943 0.77011494 0.62068966]
|
|
|
|
mean value: 0.729937304075235
|
|
|
|
key: train_accuracy
|
|
value: [0.73536896 0.75699746 0.7519084 0.74300254 0.73824651 0.74714104
|
|
0.7433291 0.74968234 0.74459975 0.74714104]
|
|
|
|
mean value: 0.7457417124972922
|
|
|
|
key: test_fscore
|
|
value: [0.84444444 0.69565217 0.69473684 0.77419355 0.80898876 0.70967742
|
|
0.76923077 0.7173913 0.76190476 0.66666667]
|
|
|
|
mean value: 0.7442886694399654
|
|
|
|
key: train_fscore
|
|
value: [0.75059952 0.76564417 0.7601476 0.75721154 0.74878049 0.75582822
|
|
0.75425791 0.75768758 0.75636364 0.75878788]
|
|
|
|
mean value: 0.7565308540334024
|
|
|
|
key: test_precision
|
|
value: [0.82608696 0.66666667 0.64705882 0.73469388 0.7826087 0.66
|
|
0.72916667 0.6875 0.8 0.6 ]
|
|
|
|
mean value: 0.7133781686587679
|
|
|
|
key: train_precision
|
|
value: [0.70975057 0.73933649 0.73571429 0.71753986 0.72065728 0.73159145
|
|
0.72429907 0.73333333 0.72222222 0.72453704]
|
|
|
|
mean value: 0.725898159276402
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.72727273 0.75 0.81818182 0.8372093 0.76744186
|
|
0.81395349 0.75 0.72727273 0.75 ]
|
|
|
|
mean value: 0.7804968287526427
|
|
|
|
key: train_recall
|
|
value: [0.79643766 0.79389313 0.78625954 0.80152672 0.77918782 0.78172589
|
|
0.78680203 0.78371501 0.79389313 0.79643766]
|
|
|
|
mean value: 0.7899878585913382
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.68181818 0.67045455 0.76136364 0.80496829 0.69053911
|
|
0.75924947 0.7005814 0.77061311 0.61918605]
|
|
|
|
mean value: 0.7299682875264271
|
|
|
|
key: train_roc_auc
|
|
value: [0.73536896 0.75699746 0.7519084 0.74300254 0.73819442 0.74709704
|
|
0.74327379 0.74972553 0.7446623 0.7472036 ]
|
|
|
|
mean value: 0.7457434029526873
|
|
|
|
key: test_jcc
|
|
value: [0.73076923 0.53333333 0.53225806 0.63157895 0.67924528 0.55
|
|
0.625 0.55932203 0.61538462 0.5 ]
|
|
|
|
mean value: 0.5956891508288903
|
|
|
|
key: train_jcc
|
|
value: [0.60076775 0.62027833 0.61309524 0.60928433 0.59844055 0.60749507
|
|
0.60546875 0.60990099 0.60818713 0.61132812]
|
|
|
|
mean value: 0.6084246269566757
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.16049242 0.13136196 0.14000368 0.14280915 0.29893279 0.13998246
|
|
0.15176749 0.14539051 0.1494689 0.14014292]
|
|
|
|
mean value: 0.16003522872924805
|
|
|
|
key: score_time
|
|
value: [0.01117945 0.0121758 0.01154327 0.01159501 0.01143241 0.01127696
|
|
0.01138067 0.01289082 0.01135778 0.01130652]
|
|
|
|
mean value: 0.011613869667053222
|
|
|
|
key: test_mcc
|
|
value: [0.90909091 0.77594029 0.77272727 0.81902836 0.77008457 0.77312462
|
|
0.84118687 0.77008457 0.81972843 0.81935269]
|
|
|
|
mean value: 0.8070348585820003
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95454545 0.88636364 0.88636364 0.90909091 0.88505747 0.88505747
|
|
0.91954023 0.88505747 0.90804598 0.90804598]
|
|
|
|
mean value: 0.9027168234064786
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.95454545 0.89130435 0.88636364 0.91111111 0.88372093 0.87804878
|
|
0.92134831 0.88636364 0.9047619 0.91304348]
|
|
|
|
mean value: 0.9030611594559804
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95454545 0.85416667 0.88636364 0.89130435 0.88372093 0.92307692
|
|
0.89130435 0.88636364 0.95 0.875 ]
|
|
|
|
mean value: 0.8995845942901048
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.93181818 0.88636364 0.93181818 0.88372093 0.8372093
|
|
0.95348837 0.88636364 0.86363636 0.95454545]
|
|
|
|
mean value: 0.9083509513742072
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95454545 0.88636364 0.88636364 0.90909091 0.88504228 0.88451374
|
|
0.919926 0.88504228 0.90856237 0.90750529]
|
|
|
|
mean value: 0.9026955602536998
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.91304348 0.80392157 0.79591837 0.83673469 0.79166667 0.7826087
|
|
0.85416667 0.79591837 0.82608696 0.84 ]
|
|
|
|
mean value: 0.8240065460966995
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.06331563 0.08405757 0.09509635 0.07134914 0.07386637 0.05402017
|
|
0.08309579 0.09428 0.09064746 0.09419298]
|
|
|
|
mean value: 0.08039214611053466
|
|
|
|
key: score_time
|
|
value: [0.02636814 0.01936841 0.02469587 0.01357269 0.01332116 0.01356101
|
|
0.02252293 0.02556062 0.02522826 0.02603841]
|
|
|
|
mean value: 0.02102375030517578
|
|
|
|
key: test_mcc
|
|
value: [0.66759342 0.63900965 0.38888266 0.52286233 0.69052856 0.65994555
|
|
0.60331932 0.61648587 0.56319416 0.5404983 ]
|
|
|
|
mean value: 0.5892319806067103
|
|
|
|
key: train_mcc
|
|
value: [0.77627846 0.78530555 0.77151353 0.79550432 0.79120432 0.79390425
|
|
0.74423479 0.79970843 0.79223524 0.77729425]
|
|
|
|
mean value: 0.7827183127675109
|
|
|
|
key: test_accuracy
|
|
value: [0.82954545 0.81818182 0.69318182 0.76136364 0.83908046 0.82758621
|
|
0.79310345 0.8045977 0.7816092 0.77011494]
|
|
|
|
mean value: 0.7918364681295715
|
|
|
|
key: train_accuracy
|
|
value: [0.88676845 0.89185751 0.88422392 0.89694656 0.89453621 0.89580686
|
|
0.8703939 0.89834816 0.89453621 0.88691233]
|
|
|
|
mean value: 0.8900330109831841
|
|
|
|
key: test_fscore
|
|
value: [0.84210526 0.82608696 0.70967742 0.76404494 0.85106383 0.83516484
|
|
0.8125 0.82105263 0.78651685 0.77777778]
|
|
|
|
mean value: 0.8025990511096076
|
|
|
|
key: train_fscore
|
|
value: [0.89133089 0.89519112 0.88915956 0.9001233 0.89840881 0.8997555
|
|
0.87651332 0.90243902 0.89890378 0.89185905]
|
|
|
|
mean value: 0.8943684363139492
|
|
|
|
key: test_precision
|
|
value: [0.78431373 0.79166667 0.67346939 0.75555556 0.78431373 0.79166667
|
|
0.73584906 0.76470588 0.77777778 0.76086957]
|
|
|
|
mean value: 0.7620188009576266
|
|
|
|
key: train_precision
|
|
value: [0.85680751 0.86842105 0.85280374 0.87320574 0.86761229 0.86792453
|
|
0.83796296 0.86651054 0.86214953 0.85348837]
|
|
|
|
mean value: 0.8606886272167267
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.86363636 0.75 0.77272727 0.93023256 0.88372093
|
|
0.90697674 0.88636364 0.79545455 0.79545455]
|
|
|
|
mean value: 0.8493657505285412
|
|
|
|
key: train_recall
|
|
value: [0.92875318 0.92366412 0.92875318 0.92875318 0.93147208 0.93401015
|
|
0.91878173 0.94147583 0.9389313 0.93384224]
|
|
|
|
mean value: 0.9308436987380685
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.81818182 0.69318182 0.76136364 0.84011628 0.8282241
|
|
0.79439746 0.80364693 0.7814482 0.7698203 ]
|
|
|
|
mean value: 0.7919926004228329
|
|
|
|
key: train_roc_auc
|
|
value: [0.88676845 0.89185751 0.88422392 0.89694656 0.89448922 0.89575826
|
|
0.87033234 0.89840289 0.89459255 0.88697188]
|
|
|
|
mean value: 0.8900343576032342
|
|
|
|
key: test_jcc
|
|
value: [0.72727273 0.7037037 0.55 0.61818182 0.74074074 0.71698113
|
|
0.68421053 0.69642857 0.64814815 0.63636364]
|
|
|
|
mean value: 0.6722031004230606
|
|
|
|
key: train_jcc
|
|
value: [0.80396476 0.81026786 0.8004386 0.81838565 0.81555556 0.81777778
|
|
0.78017241 0.82222222 0.81637168 0.80482456]
|
|
|
|
mean value: 0.8089981073735648
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02132583 0.01521826 0.01540136 0.01523566 0.01541209 0.01545763
|
|
0.01539135 0.01569581 0.01536846 0.01550746]
|
|
|
|
mean value: 0.016001391410827636
|
|
|
|
key: score_time
|
|
value: [0.02263594 0.01224017 0.01216555 0.01229405 0.01220536 0.01214266
|
|
0.01215577 0.01224852 0.01231098 0.01221323]
|
|
|
|
mean value: 0.013261222839355468
|
|
|
|
key: test_mcc
|
|
value: [0.50847518 0.40951418 0.3640126 0.60092521 0.58699109 0.51744186
|
|
0.46196713 0.24188306 0.42547569 0.40794313]
|
|
|
|
mean value: 0.45246291247679393
|
|
|
|
key: train_mcc
|
|
value: [0.47250952 0.48272595 0.47741223 0.47563022 0.4743098 0.47161961
|
|
0.48354281 0.48303054 0.48343926 0.47368777]
|
|
|
|
mean value: 0.47779077011814536
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.70454545 0.68181818 0.79545455 0.79310345 0.75862069
|
|
0.72413793 0.62068966 0.71264368 0.70114943]
|
|
|
|
mean value: 0.7242163009404389
|
|
|
|
key: train_accuracy
|
|
value: [0.73536896 0.74045802 0.73791349 0.73664122 0.73570521 0.73443456
|
|
0.7407878 0.7407878 0.7407878 0.73570521]
|
|
|
|
mean value: 0.7378590065666314
|
|
|
|
key: test_fscore
|
|
value: [0.77083333 0.71111111 0.68888889 0.8125 0.79545455 0.75862069
|
|
0.75 0.64516129 0.71264368 0.72916667]
|
|
|
|
mean value: 0.7374380203593218
|
|
|
|
key: train_fscore
|
|
value: [0.74634146 0.75121951 0.74816626 0.74909091 0.75 0.74849579
|
|
0.75242718 0.75 0.75121951 0.74757282]
|
|
|
|
mean value: 0.7494533444271472
|
|
|
|
key: test_precision
|
|
value: [0.71153846 0.69565217 0.67391304 0.75 0.77777778 0.75
|
|
0.67924528 0.6122449 0.72093023 0.67307692]
|
|
|
|
mean value: 0.7044378793320658
|
|
|
|
key: train_precision
|
|
value: [0.71662763 0.72131148 0.72 0.71527778 0.71232877 0.71167048
|
|
0.72093023 0.72340426 0.72131148 0.71461717]
|
|
|
|
mean value: 0.7177479268181197
|
|
|
|
key: test_recall
|
|
value: [0.84090909 0.72727273 0.70454545 0.88636364 0.81395349 0.76744186
|
|
0.8372093 0.68181818 0.70454545 0.79545455]
|
|
|
|
mean value: 0.7759513742071882
|
|
|
|
key: train_recall
|
|
value: [0.77862595 0.78371501 0.77862595 0.78625954 0.79187817 0.7893401
|
|
0.78680203 0.77862595 0.78371501 0.78371501]
|
|
|
|
mean value: 0.784130274731662
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.70454545 0.68181818 0.79545455 0.79334038 0.75872093
|
|
0.72542283 0.61997886 0.71273784 0.70005285]
|
|
|
|
mean value: 0.7242071881606765
|
|
|
|
key: train_roc_auc
|
|
value: [0.73536896 0.74045802 0.73791349 0.73664122 0.73563374 0.73436471
|
|
0.74072926 0.74083582 0.74084228 0.73576614]
|
|
|
|
mean value: 0.7378553622402191
|
|
|
|
key: test_jcc
|
|
value: [0.62711864 0.55172414 0.52542373 0.68421053 0.66037736 0.61111111
|
|
0.6 0.47619048 0.55357143 0.57377049]
|
|
|
|
mean value: 0.5863497903295041
|
|
|
|
key: train_jcc
|
|
value: [0.59533074 0.6015625 0.59765625 0.59883721 0.6 0.59807692
|
|
0.60311284 0.6 0.6015625 0.59689922]
|
|
|
|
mean value: 0.5993038186951987
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02958322 0.02433634 0.02579641 0.0442853 0.03078198 0.02677584
|
|
0.02552652 0.02910757 0.02560163 0.02800775]
|
|
|
|
mean value: 0.028980255126953125
|
|
|
|
key: score_time
|
|
value: [0.01246619 0.01228714 0.01220798 0.01227021 0.01223183 0.01228809
|
|
0.01225042 0.01228547 0.01230955 0.0122788 ]
|
|
|
|
mean value: 0.012287569046020509
|
|
|
|
key: test_mcc
|
|
value: [0.63636364 0.3796283 0.33241884 0.53674504 0.75739672 0.56342495
|
|
0.59245365 0.35625628 0.61774328 0.15163988]
|
|
|
|
mean value: 0.4924070576625966
|
|
|
|
key: train_mcc
|
|
value: [0.70851405 0.46547729 0.45428788 0.57137778 0.72539042 0.71336542
|
|
0.67528879 0.31755015 0.6011329 0.16168478]
|
|
|
|
mean value: 0.5394069469480227
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.65909091 0.625 0.75 0.87356322 0.7816092
|
|
0.79310345 0.6091954 0.79310345 0.51724138]
|
|
|
|
mean value: 0.7220088819226751
|
|
|
|
key: train_accuracy
|
|
value: [0.8524173 0.68956743 0.67430025 0.75826972 0.86022872 0.85641677
|
|
0.83735705 0.59593393 0.77382465 0.52604828]
|
|
|
|
mean value: 0.742436411017456
|
|
|
|
key: test_fscore
|
|
value: [0.81818182 0.53125 0.44067797 0.69444444 0.88172043 0.7816092
|
|
0.80434783 0.37037037 0.82352941 0.08695652]
|
|
|
|
mean value: 0.6233087984198946
|
|
|
|
key: train_fscore
|
|
value: [0.84450402 0.56272401 0.52059925 0.69255663 0.86810552 0.8538163
|
|
0.83419689 0.32627119 0.81223629 0.0968523 ]
|
|
|
|
mean value: 0.6411862401536421
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.85 0.86666667 0.89285714 0.82 0.77272727
|
|
0.75510204 1. 0.72413793 1. ]
|
|
|
|
mean value: 0.849967287228371
|
|
|
|
key: train_precision
|
|
value: [0.89235127 0.95151515 0.9858156 0.95111111 0.82272727 0.8707124
|
|
0.85185185 0.97468354 0.69369369 1. ]
|
|
|
|
mean value: 0.8994461903882702
|
|
|
|
key: test_recall
|
|
value: [0.81818182 0.38636364 0.29545455 0.56818182 0.95348837 0.79069767
|
|
0.86046512 0.22727273 0.95454545 0.04545455]
|
|
|
|
mean value: 0.5900105708245243
|
|
|
|
key: train_recall
|
|
value: [0.80152672 0.39949109 0.35368957 0.54452926 0.91878173 0.83756345
|
|
0.81725888 0.19592875 0.97964377 0.05089059]
|
|
|
|
mean value: 0.589930380646078
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.65909091 0.625 0.75 0.87447146 0.78171247
|
|
0.79386892 0.61363636 0.79122622 0.52272727]
|
|
|
|
mean value: 0.7229915433403805
|
|
|
|
key: train_roc_auc
|
|
value: [0.8524173 0.68956743 0.67430025 0.75826972 0.86015422 0.85644076
|
|
0.83738262 0.59542631 0.77408584 0.52544529]
|
|
|
|
mean value: 0.7423489750842794
|
|
|
|
key: test_jcc
|
|
value: [0.69230769 0.36170213 0.2826087 0.53191489 0.78846154 0.64150943
|
|
0.67272727 0.22727273 0.7 0.04545455]
|
|
|
|
mean value: 0.494395892711481
|
|
|
|
key: train_jcc
|
|
value: [0.73085847 0.3915212 0.35189873 0.52970297 0.76694915 0.74492099
|
|
0.71555556 0.19493671 0.68383659 0.05089059]
|
|
|
|
mean value: 0.5161070955285676
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.66
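The per-fold scores above are consistent with one another, and a single confusion matrix reproduces them. The sketch below takes fold 10 of the Passive Aggressive run, where test_precision is 1.0 and test_recall is 0.04545455. The four counts are inferred from those printed values and are not printed anywhere in the log, so treat them as an assumption. The test_roc_auc value is reproduced here as the mean of recall and specificity, which suggests, but does not confirm, that the run scored AUC on hard class predictions.

```python
from math import sqrt

# Assumed confusion-matrix counts for test fold 10 (inferred, not logged):
# 87 test samples, 44 positives, 43 negatives, only 2 positives predicted.
tp, fp, fn, tn = 2, 0, 42, 43

precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
fscore = 2 * precision * recall / (precision + recall)
jcc = tp / (tp + fp + fn)                      # Jaccard index of the positive class
roc_auc = (recall + specificity) / 2           # AUC of a single hard-label operating point
mcc = (tp * tn - fp * fn) / sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

for name, value in [
    ("test_mcc", mcc),
    ("test_accuracy", accuracy),
    ("test_fscore", fscore),
    ("test_precision", precision),
    ("test_recall", recall),
    ("test_roc_auc", roc_auc),
    ("test_jcc", jcc),
]:
    print(f"{name}: {value:.8f}")
```

With precision at 1.0 and recall near 0.05, the fold predicts almost everything as the negative class, which is why accuracy stays near 0.5 while MCC drops to about 0.15.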
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
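For orientation, here is a minimal sketch of how a pipeline like the one printed above could yield the `fit_time`, `score_time`, `test_*` and `train_*` keys that follow. It is not the original script. The data is synthetic, only three of the logged column names are reused, and the scorer dictionary and the 10-fold stratified split are assumptions chosen to match the key names and the ten values per key in the log.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             make_scorer, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data (assumption): two numerical and one categorical column.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, n),
    "ddg_foldx": rng.normal(0, 1, n),
    "ss_class": rng.choice(["helix", "sheet", "coil"], n),
})
y = (X["ligand_distance"] < 15).astype(int)

# Scale numerical columns, one-hot encode categorical ones, as in the log.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), ["ligand_distance", "ddg_foldx"]),
        ("cat", OneHotEncoder(), ["ss_class"]),
    ],
)
pipe = Pipeline(steps=[("prep", prep), ("model", SGDClassifier(random_state=42))])

# Assumed scorers; the dictionary keys become the test_/train_ suffixes.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": make_scorer(accuracy_score),
    "fscore": make_scorer(f1_score),
    "precision": make_scorer(precision_score, zero_division=0),
    "recall": make_scorer(recall_score),
    "roc_auc": make_scorer(roc_auc_score),
    "jcc": make_scorer(jaccard_score),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring, return_train_score=True)

# Print in the same key / value / mean value layout as the log.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Keeping the preprocessing inside the `Pipeline` means the scaler and encoder are refit on each training fold, so no statistics from the held-out fold leak into training.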
|
|
|
|
key: fit_time
|
|
value: [0.02404928 0.03382921 0.04285789 0.03751874 0.03870797 0.02889562
|
|
0.03123188 0.03173637 0.03078961 0.03123069]
|
|
|
|
mean value: 0.03308472633361816
|
|
|
|
key: score_time
|
|
value: [0.01227283 0.01235342 0.01227856 0.01233625 0.01231527 0.01219463
|
|
0.01230526 0.02353597 0.0126853 0.01253462]
|
|
|
|
mean value: 0.013481211662292481
|
|
|
|
key: test_mcc
|
|
value: [0.70472748 0.5933661 0.46225016 0.57188626 0.65539112 0.19116707
|
|
0.61789034 0.5504913 0.63213531 0.64236223]
|
|
|
|
mean value: 0.5621667377709061
|
|
|
|
key: train_mcc
|
|
value: [0.72552928 0.76845498 0.73666335 0.75203977 0.74647425 0.20313543
|
|
0.74500921 0.68146897 0.70781234 0.7233087 ]
|
|
|
|
mean value: 0.6789896271218194
|
|
|
|
key: test_accuracy
|
|
value: [0.85227273 0.79545455 0.72727273 0.78409091 0.82758621 0.54022989
|
|
0.8045977 0.77011494 0.81609195 0.81609195]
|
|
|
|
mean value: 0.773380355276907
|
|
|
|
key: train_accuracy
|
|
value: [0.86132316 0.88295165 0.86386768 0.87531807 0.8729352 0.54129606
|
|
0.8703939 0.8360864 0.85387548 0.85387548]
|
|
|
|
mean value: 0.8311923075679538
|
|
|
|
key: test_fscore
|
|
value: [0.85393258 0.80434783 0.75 0.79569892 0.82758621 0.13043478
|
|
0.8172043 0.75 0.81818182 0.83333333]
|
|
|
|
mean value: 0.738071977718347
|
|
|
|
key: train_fscore
|
|
value: [0.85486019 0.88753056 0.87367178 0.87901235 0.87562189 0.15850816
|
|
0.87710843 0.82108183 0.85461441 0.86735871]
|
|
|
|
mean value: 0.7949368311113626
|
|
|
|
key: test_precision
|
|
value: [0.84444444 0.77083333 0.69230769 0.75510204 0.81818182 1.
|
|
0.76 0.83333333 0.81818182 0.76923077]
|
|
|
|
mean value: 0.8061615249829536
|
|
|
|
key: train_precision
|
|
value: [0.89664804 0.85411765 0.81497797 0.85371703 0.85853659 0.97142857
|
|
0.83486239 0.90243902 0.84924623 0.79324895]
|
|
|
|
mean value: 0.8629222434507968
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.84090909 0.81818182 0.84090909 0.8372093 0.06976744
|
|
0.88372093 0.68181818 0.81818182 0.90909091]
|
|
|
|
mean value: 0.7563424947145877
|
|
|
|
key: train_recall
|
|
value: [0.81679389 0.92366412 0.94147583 0.90585242 0.89340102 0.08629442
|
|
0.92385787 0.75318066 0.86005089 0.956743 ]
|
|
|
|
mean value: 0.806131411374175
|
|
|
|
key: test_roc_auc
|
|
value: [0.85227273 0.79545455 0.72727273 0.78409091 0.82769556 0.53488372
|
|
0.80549683 0.77114165 0.81606765 0.81501057]
|
|
|
|
mean value: 0.772938689217759
|
|
|
|
key: train_roc_auc
|
|
value: [0.86132316 0.88295165 0.86386768 0.87531807 0.87290916 0.54187494
|
|
0.87032588 0.83598119 0.85388331 0.85400602]
|
|
|
|
mean value: 0.8312441068960618
|
|
|
|
key: test_jcc
|
|
value: [0.74509804 0.67272727 0.6 0.66071429 0.70588235 0.06976744
|
|
0.69090909 0.6 0.69230769 0.71428571]
|
|
|
|
mean value: 0.6151691889961384
|
|
|
|
key: train_jcc
|
|
value: [0.74651163 0.7978022 0.77568134 0.78414097 0.77876106 0.08607595
|
|
0.78111588 0.69647059 0.74613687 0.76578411]
|
|
|
|
mean value: 0.6958480595363976
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.27864838 0.27138972 0.2752707 0.27588749 0.2660768 0.27190518
|
|
0.27267504 0.27442336 0.26846004 0.26556134]
|
|
|
|
mean value: 0.27202980518341063
|
|
|
|
key: score_time
|
|
value: [0.01731539 0.01707101 0.01773667 0.01637697 0.01633453 0.01748466
|
|
0.01726389 0.01764894 0.01723289 0.01652169]
|
|
|
|
mean value: 0.017098665237426758
|
|
|
|
key: test_mcc
|
|
value: [0.86363636 0.75174939 0.73413035 0.81818182 0.81683533 0.7472238
|
|
0.7951307 0.72689655 0.79323121 0.81935269]
|
|
|
|
mean value: 0.7866368198854106
|
|
|
|
key: train_mcc
|
|
value: [0.88323763 0.89581641 0.90099965 0.90360046 0.90632277 0.89603807
|
|
0.87806138 0.91112948 0.91370369 0.89096032]
|
|
|
|
mean value: 0.8979869877722635
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.875 0.86363636 0.90909091 0.90804598 0.87356322
|
|
0.89655172 0.86206897 0.89655172 0.90804598]
|
|
|
|
mean value: 0.8924373040752351
|
|
|
|
key: train_accuracy
|
|
value: [0.94147583 0.94783715 0.95038168 0.95165394 0.95298602 0.94790343
|
|
0.93900889 0.95552732 0.95679797 0.94536213]
|
|
|
|
mean value: 0.9488934369250964
|
|
|
|
key: test_fscore
|
|
value: [0.93181818 0.87912088 0.87234043 0.90909091 0.9047619 0.87058824
|
|
0.8988764 0.86956522 0.8988764 0.91304348]
|
|
|
|
mean value: 0.8948082040258845
|
|
|
|
key: train_fscore
|
|
value: [0.94221106 0.9482976 0.9509434 0.95226131 0.95369212 0.94855709
|
|
0.93939394 0.95575221 0.95707071 0.94591195]
|
|
|
|
mean value: 0.9494091374838326
|
|
|
|
key: test_precision
|
|
value: [0.93181818 0.85106383 0.82 0.90909091 0.92682927 0.88095238
|
|
0.86956522 0.83333333 0.88888889 0.875 ]
|
|
|
|
mean value: 0.8786542009554915
|
|
|
|
key: train_precision
|
|
value: [0.93052109 0.94 0.94029851 0.94044665 0.94074074 0.93796526
|
|
0.93467337 0.94974874 0.94987469 0.93532338]
|
|
|
|
mean value: 0.939959243103895
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.90909091 0.93181818 0.90909091 0.88372093 0.86046512
|
|
0.93023256 0.90909091 0.90909091 0.95454545]
|
|
|
|
mean value: 0.9128964059196617
|
|
|
|
key: train_recall
|
|
value: [0.95419847 0.956743 0.96183206 0.96437659 0.96700508 0.95939086
|
|
0.94416244 0.96183206 0.96437659 0.956743 ]
|
|
|
|
mean value: 0.9590660156805001
|
|
|
|
key: test_roc_auc
|
|
value: [0.93181818 0.875 0.86363636 0.90909091 0.90776956 0.87341438
|
|
0.89693446 0.8615222 0.89640592 0.90750529]
|
|
|
|
mean value: 0.8923097251585623
|
|
|
|
key: train_roc_auc
|
|
value: [0.94147583 0.94783715 0.95038168 0.95165394 0.95296819 0.94788882
|
|
0.93900234 0.95553532 0.95680758 0.94537658]
|
|
|
|
mean value: 0.9488927422792266
|
|
|
|
key: test_jcc
|
|
value: [0.87234043 0.78431373 0.77358491 0.83333333 0.82608696 0.77083333
|
|
0.81632653 0.76923077 0.81632653 0.84 ]
|
|
|
|
mean value: 0.8102376510326154
|
|
|
|
key: train_jcc
|
|
value: [0.89073634 0.90167866 0.90647482 0.9088729 0.91148325 0.90214797
|
|
0.88571429 0.91525424 0.91767554 0.8973747 ]
|
|
|
|
mean value: 0.903741271535579
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
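The OOB warnings logged for this model come from combining `oob_score=True` with the default ensemble size. `BaggingClassifier` builds 10 estimators unless `n_estimators` is set, and with so few bootstrap draws some training rows end up in every bag, so they never receive an out-of-bag prediction. The sketch below reproduces that on synthetic data and shows that a larger ensemble is likely to avoid it. The dataset and the value 200 are illustrative assumptions, not settings from the original run.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

# Synthetic stand-in data (assumption), similar in size to the logged training folds.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)


def count_oob_warnings(n_estimators):
    """Fit a bagging ensemble and count the 'no OOB scores' warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        BaggingClassifier(
            n_estimators=n_estimators, oob_score=True, random_state=42
        ).fit(X, y)
    return sum("OOB" in str(w.message) for w in caught)


few = count_oob_warnings(10)     # default ensemble size, as in the log
many = count_oob_warnings(200)   # hypothetical larger ensemble

print("OOB warnings with 10 estimators:", few)
print("OOB warnings with 200 estimators:", many)
```

Each row lands in a given bootstrap sample with probability of about 0.63, so with 10 estimators roughly 1% of rows are never out of bag. With 200 estimators that fraction is effectively zero.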
|
|
|
|
key: fit_time
|
|
value: [0.21593761 0.12754798 0.21600103 0.12391162 0.22773957 0.23310304
|
|
0.23577619 0.22948527 0.23245597 0.24050713]
|
|
|
|
mean value: 0.2082465410232544
|
|
|
|
key: score_time
|
|
value: [0.04016471 0.02497435 0.04214454 0.01842999 0.04276419 0.03724957
|
|
0.04361939 0.04263687 0.04333544 0.03482246]
|
|
|
|
mean value: 0.03701415061950684
|
|
|
|
key: test_mcc
|
|
value: [0.88659264 0.75174939 0.77352678 0.79566006 0.81702814 0.72410148
|
|
0.81683533 0.79480784 0.75240169 0.88524603]
|
|
|
|
mean value: 0.7997949390227214
|
|
|
|
key: train_mcc
|
|
value: [0.98735727 0.98728055 0.98221691 0.97717833 0.98988607 0.97478912
|
|
0.98476502 0.98737301 0.97971996 0.98228992]
|
|
|
|
mean value: 0.983285614378063
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.875 0.88636364 0.89772727 0.90804598 0.86206897
|
|
0.90804598 0.89655172 0.87356322 0.94252874]
|
|
|
|
mean value: 0.8993077324973877
|
|
|
|
key: train_accuracy
|
|
value: [0.99363868 0.99363868 0.99109415 0.98854962 0.99491741 0.98729352
|
|
0.99237611 0.99364676 0.98983482 0.99110546]
|
|
|
|
mean value: 0.9916095198373053
|
|
|
|
key: test_fscore
|
|
value: [0.94252874 0.87912088 0.88372093 0.8988764 0.90909091 0.86046512
|
|
0.9047619 0.9010989 0.86746988 0.94382022]
|
|
|
|
mean value: 0.8990953884947962
|
|
|
|
key: train_fscore
|
|
value: [0.99359795 0.99363057 0.99106003 0.98847631 0.99489796 0.98717949
|
|
0.99236641 0.99359795 0.98976982 0.99103713]
|
|
|
|
mean value: 0.9915613625330997
|
|
|
|
key: test_precision
|
|
value: [0.95348837 0.85106383 0.9047619 0.88888889 0.88888889 0.86046512
|
|
0.92682927 0.87234043 0.92307692 0.93333333]
|
|
|
|
mean value: 0.9003136950933864
|
|
|
|
key: train_precision
|
|
value: [1. 0.99489796 0.99487179 0.99484536 1. 0.99740933
|
|
0.99489796 1. 0.99485861 0.99742268]
|
|
|
|
mean value: 0.9969203692726318
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.90909091 0.86363636 0.90909091 0.93023256 0.86046512
|
|
0.88372093 0.93181818 0.81818182 0.95454545]
|
|
|
|
mean value: 0.8992600422832981
|
|
|
|
key: train_recall
|
|
value: [0.98727735 0.99236641 0.98727735 0.9821883 0.98984772 0.97715736
|
|
0.98984772 0.98727735 0.98473282 0.98473282]
|
|
|
|
mean value: 0.9862705209180972
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.875 0.88636364 0.89772727 0.9082981 0.86205074
|
|
0.90776956 0.89614165 0.87420719 0.94238901]
|
|
|
|
mean value: 0.8993128964059196
|
|
|
|
key: train_roc_auc
|
|
value: [0.99363868 0.99363868 0.99109415 0.98854962 0.99492386 0.98730642
|
|
0.99237933 0.99363868 0.98982834 0.99109738]
|
|
|
|
mean value: 0.9916095116312111
|
|
|
|
key: test_jcc
|
|
value: [0.89130435 0.78431373 0.79166667 0.81632653 0.83333333 0.75510204
|
|
0.82608696 0.82 0.76595745 0.89361702]
|
|
|
|
mean value: 0.81777080693517
|
|
|
|
key: train_jcc
|
|
value: [0.98727735 0.98734177 0.98227848 0.97721519 0.98984772 0.97468354
|
|
0.98484848 0.98727735 0.97974684 0.9822335 ]
|
|
|
|
mean value: 0.9832750233286541
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.37668085 0.38118434 0.36029553 0.38846707 0.28293347 0.33020926
|
|
0.3172996 0.28086233 0.47087789 0.33524823]
|
|
|
|
mean value: 0.35240585803985597
|
|
|
|
key: score_time
|
|
value: [0.03350472 0.0394361 0.01947856 0.03539586 0.03546238 0.01960278
|
|
0.01966977 0.0196321 0.02036977 0.02100039]
|
|
|
|
mean value: 0.02635524272918701

key: test_mcc
value: [0.67332702 0.41079192 0.59090909 0.54601891 0.59245365 0.59717038
 0.44896886 0.35707199 0.51744186 0.54198427]

mean value: 0.5276137954634224

key: train_mcc
value: [0.92486253 0.90935126 0.9146712 0.91009558 0.91454821 0.91260837
 0.91502105 0.91530625 0.91478356 0.92189349]

mean value: 0.9153141494896055

key: test_accuracy
value: [0.82954545 0.70454545 0.79545455 0.77272727 0.79310345 0.79310345
 0.72413793 0.67816092 0.75862069 0.77011494]

mean value: 0.7619514106583072

key: train_accuracy
value: [0.96183206 0.95419847 0.956743 0.95419847 0.95679797 0.95552732
 0.95679797 0.95679797 0.95679797 0.96060991]

mean value: 0.9570301108018016

key: test_fscore
value: [0.84536082 0.7173913 0.79545455 0.77777778 0.80434783 0.80851064
 0.72727273 0.69565217 0.75862069 0.7826087 ]

mean value: 0.7712997203200364

key: train_fscore
value: [0.96277916 0.95522388 0.95781638 0.95555556 0.95781638 0.9568434
 0.95802469 0.95802469 0.95781638 0.96129838]

mean value: 0.9581198886944443

key: test_precision
value: [0.77358491 0.6875 0.79545455 0.76086957 0.75510204 0.74509804
 0.71111111 0.66666667 0.76744186 0.75 ]

mean value: 0.7412828734607221

key: train_precision
value: [0.93946731 0.93430657 0.9346247 0.92805755 0.9368932 0.93045564
 0.93269231 0.93045564 0.9346247 0.94362745]

mean value: 0.9345205063861101

key: test_recall
value: [0.93181818 0.75 0.79545455 0.79545455 0.86046512 0.88372093
 0.74418605 0.72727273 0.75 0.81818182]

mean value: 0.8056553911205074

key: train_recall
value: [0.98727735 0.97709924 0.9821883 0.98473282 0.97969543 0.98477157
 0.98477157 0.98727735 0.9821883 0.97964377]

mean value: 0.9829645703362136

key: test_roc_auc
value: [0.82954545 0.70454545 0.79545455 0.77272727 0.79386892 0.79413319
 0.72436575 0.67758985 0.75872093 0.76955603]

mean value: 0.7620507399577168

key: train_roc_auc
value: [0.96183206 0.95419847 0.956743 0.95419847 0.95676884 0.95549011
 0.95676238 0.95683665 0.95683019 0.96063407]

mean value: 0.9570294235414164

key: test_jcc
value: [0.73214286 0.55932203 0.66037736 0.63636364 0.67272727 0.67857143
 0.57142857 0.53333333 0.61111111 0.64285714]

mean value: 0.6298234745924225

key: train_jcc
value: [0.92822967 0.91428571 0.91904762 0.91489362 0.91904762 0.91725768
 0.91943128 0.91943128 0.91904762 0.92548077]

mean value: 0.9196152865209224

MCC on Blind test: 0.36

Accuracy on Blind test: 0.69

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [1.16928625 1.14868641 1.13823104 1.14943147 1.14699316 1.14772367
 1.1377604 1.14932823 1.15342426 1.15133929]

mean value: 1.1492204189300537

key: score_time
value: [0.0095892 0.0095458 0.00968099 0.0096097 0.009974 0.0093739
 0.00960636 0.00960088 0.01036811 0.00943041]

mean value: 0.009677934646606445

key: test_mcc
value: [0.90909091 0.75174939 0.84112635 0.82158384 0.7951307 0.83923862
 0.90904296 0.79480784 0.81972843 0.84093745]

mean value: 0.8322436481865038

key: train_mcc
value: [0.94659247 0.9694782 0.964489 0.96951587 0.96447209 0.96453461
 0.96203358 0.96961653 0.96443403 0.9542566 ]

mean value: 0.9629422979060673

key: test_accuracy
value: [0.95454545 0.875 0.92045455 0.90909091 0.89655172 0.91954023
 0.95402299 0.89655172 0.90804598 0.91954023]

mean value: 0.9153343782654128

key: train_accuracy
value: [0.97328244 0.98473282 0.9821883 0.98473282 0.98221093 0.98221093
 0.98094028 0.98475222 0.98221093 0.97712834]

mean value: 0.9814390008115335

key: test_fscore
value: [0.95454545 0.87912088 0.92134831 0.91304348 0.8988764 0.91764706
 0.95454545 0.9010989 0.9047619 0.92307692]

mean value: 0.916806477333504

key: train_fscore
value: [0.97318008 0.98469388 0.98205128 0.98465473 0.98214286 0.98209719
 0.98079385 0.98461538 0.98214286 0.97709924]

mean value: 0.9813471343964834

key: test_precision
value: [0.95454545 0.85106383 0.91111111 0.875 0.86956522 0.92857143
 0.93333333 0.87234043 0.95 0.89361702]

mean value: 0.9039147821548377

key: train_precision
value: [0.97692308 0.98721228 0.98966408 0.98971722 0.98717949 0.98969072
 0.98966408 0.99224806 0.98465473 0.97709924]

mean value: 0.986405298110647

key: test_recall
value: [0.95454545 0.90909091 0.93181818 0.95454545 0.93023256 0.90697674
 0.97674419 0.93181818 0.86363636 0.95454545]

mean value: 0.9313953488372093

key: train_recall
value: [0.96946565 0.9821883 0.97455471 0.97964377 0.97715736 0.97461929
 0.97208122 0.97709924 0.97964377 0.97709924]

mean value: 0.9763552524508854

key: test_roc_auc
value: [0.95454545 0.875 0.92045455 0.90909091 0.89693446 0.91939746
 0.95428118 0.89614165 0.90856237 0.91913319]

mean value: 0.9153541226215645

key: train_roc_auc
value: [0.97328244 0.98473282 0.9821883 0.98473282 0.98221736 0.98222059
 0.98095155 0.98474251 0.98220767 0.9771283 ]

mean value: 0.9814404360574004

key: test_jcc
value: [0.91304348 0.78431373 0.85416667 0.84 0.81632653 0.84782609
 0.91304348 0.82 0.82608696 0.85714286]

mean value: 0.8471949779911965

key: train_jcc
value: [0.94776119 0.96984925 0.96473552 0.9697733 0.96491228 0.96482412
 0.96231156 0.96969697 0.96491228 0.95522388]

mean value: 0.9634000346471366

MCC on Blind test: 0.67

Accuracy on Blind test: 0.84

Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])

key: fit_time
value: [0.04250526 0.04080033 0.04067802 0.03934932 0.04005003 0.04622293
 0.04257751 0.03894424 0.03844476 0.04671168]

mean value: 0.04162840843200684

key: score_time
value: [0.01281166 0.01308036 0.01333427 0.01313806 0.01291275 0.01286411
 0.01297784 0.01300836 0.01292276 0.0129571 ]

mean value: 0.013000726699829102

key: test_mcc
value: [0.15811388 0.12598816 0.04908807 0.09016696 0.20790225 0.04655125
 0.14533074 0.13018178 0.14830203 0.19262997]

mean value: 0.12942550981727555

key: train_mcc
value: [0.23759548 0.25503069 0.26889699 0.25503069 0.2316976 0.25233931
 0.25800118 0.27907707 0.27376251 0.23713446]

mean value: 0.25485659746651074

key: test_accuracy
value: [0.54545455 0.53409091 0.51136364 0.52272727 0.55172414 0.50574713
 0.52873563 0.54022989 0.55172414 0.56321839]

mean value: 0.5355015673981192

key: train_accuracy
value: [0.55343511 0.5610687 0.56743003 0.5610687 0.55146125 0.56035578
 0.56289708 0.57179161 0.56925032 0.55273189]

mean value: 0.5611490473372972

key: test_fscore
value: [0.67741935 0.672 0.66141732 0.66666667 0.68292683 0.656
 0.672 0.67741935 0.67768595 0.68852459]

mean value: 0.6732060069024183
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")

key: train_fscore
value: [0.69129288 0.69496021 0.69804618 0.69496021 0.69062226 0.69488536
 0.69611307 0.69991095 0.69866667 0.69068541]

mean value: 0.695014321097323

key: test_precision
value: [0.525 0.51851852 0.5060241 0.51219512 0.525 0.5
 0.51219512 0.525 0.53246753 0.53846154]

mean value: 0.519486192973557

key: train_precision
value: [0.52822581 0.53252033 0.5361528 0.53252033 0.52744311 0.53243243
 0.53387534 0.53835616 0.53688525 0.52751678]

mean value: 0.5325928319334771

key: test_recall
value: [0.95454545 0.95454545 0.95454545 0.95454545 0.97674419 0.95348837
 0.97674419 0.95454545 0.93181818 0.95454545]

mean value: 0.9566067653276956

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.54545455 0.53409091 0.51136364 0.52272727 0.55655391 0.5108351
 0.53382664 0.53541226 0.54730444 0.55866808]

mean value: 0.5356236786469345

key: train_roc_auc
value: [0.55343511 0.5610687 0.56743003 0.5610687 0.55089059 0.55979644
 0.56234097 0.57233503 0.56979695 0.55329949]

mean value: 0.5611462006432363

key: test_jcc
value: [0.51219512 0.5060241 0.49411765 0.5 0.51851852 0.48809524
 0.5060241 0.51219512 0.5125 0.525 ]

mean value: 0.5074669840346103

key: train_jcc
value: [0.52822581 0.53252033 0.5361528 0.53252033 0.52744311 0.53243243
 0.53387534 0.53835616 0.53688525 0.52751678]

mean value: 0.5325928319334771

MCC on Blind test: 0.06

Accuracy on Blind test: 0.45

Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])

key: fit_time
value: [0.02579308 0.04292941 0.04280329 0.04076314 0.04551339 0.04232264
 0.0476377 0.042732 0.04306769 0.04305148]

mean value: 0.04166138172149658

key: score_time
value: [0.01925159 0.01919985 0.01918101 0.01926923 0.01917887 0.01919866
 0.01924324 0.0191319 0.01925159 0.01928449]

mean value: 0.019219040870666504

key: test_mcc
value: [0.67332702 0.5933661 0.52394654 0.52394654 0.74735729 0.61371748
 0.63065834 0.64863047 0.61028941 0.65641902]

mean value: 0.6221658218271912

key: train_mcc
value: [0.7518026 0.75042988 0.72860483 0.75891598 0.74972171 0.74736363
 0.73478421 0.76235948 0.76162178 0.73804273]

mean value: 0.7483646825055856

key: test_accuracy
value: [0.82954545 0.79545455 0.76136364 0.76136364 0.87356322 0.8045977
 0.8045977 0.81609195 0.8045977 0.82758621]

mean value: 0.8078761755485894

key: train_accuracy
value: [0.8740458 0.8740458 0.86259542 0.8778626 0.8729352 0.87166455
 0.86531131 0.87928844 0.87928844 0.86658196]

mean value: 0.8723619503962288

key: test_fscore
value: [0.84536082 0.80434783 0.76923077 0.76923077 0.87356322 0.81318681
 0.82474227 0.83673469 0.81318681 0.83516484]

mean value: 0.8184748831138817

key: train_fscore
value: [0.88 0.87882497 0.86893204 0.88321168 0.87922705 0.87816647
 0.87228916 0.88484848 0.88428745 0.87364621]

mean value: 0.8783433511013907

key: test_precision
value: [0.77358491 0.77083333 0.74468085 0.74468085 0.86363636 0.77083333
 0.74074074 0.75925926 0.78723404 0.80851064]

mean value: 0.7763994318942131

key: train_precision
value: [0.84027778 0.84669811 0.83062645 0.84615385 0.83870968 0.83678161
 0.83027523 0.84490741 0.84813084 0.82876712]

mean value: 0.839132807504431

key: test_recall
value: [0.93181818 0.84090909 0.79545455 0.79545455 0.88372093 0.86046512
 0.93023256 0.93181818 0.84090909 0.86363636]

mean value: 0.8674418604651163

key: train_recall
value: [0.92366412 0.91348601 0.91094148 0.92366412 0.92385787 0.92385787
 0.91878173 0.92875318 0.92366412 0.92366412]

mean value: 0.9214334612056161

key: test_roc_auc
value: [0.82954545 0.79545455 0.76136364 0.76136364 0.87367865 0.80523256
 0.80602537 0.8147463 0.80417548 0.82716702]

mean value: 0.8078752642706131

key: train_roc_auc
value: [0.8740458 0.8740458 0.86259542 0.8778626 0.87287041 0.87159815
 0.86524328 0.87935121 0.87934475 0.8666544 ]

mean value: 0.8723611810749021

key: test_jcc
value: [0.73214286 0.67272727 0.625 0.625 0.7755102 0.68518519
 0.70175439 0.71929825 0.68518519 0.71698113]

mean value: 0.6938784467976552

key: train_jcc
value: [0.78571429 0.78384279 0.76824034 0.79084967 0.78448276 0.7827957
 0.77350427 0.79347826 0.79257642 0.77564103]

mean value: 0.7831125533798624

MCC on Blind test: 0.33

Accuracy on Blind test: 0.68

Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=10)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:136: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:139: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=10))])

key: fit_time
value: [0.32069945 0.18340683 0.38573503 0.23420286 0.37261057 0.35421038
 0.28985262 0.33406305 0.21048737 0.33007169]

mean value: 0.30153398513793944

key: score_time
value: [0.01916289 0.01916862 0.02289367 0.02675509 0.01928449 0.01241636
 0.02552414 0.01919389 0.01500607 0.01925492]

mean value: 0.019866013526916505

key: test_mcc
value: [0.67332702 0.59648091 0.52394654 0.52394654 0.71089459 0.65994555
 0.62350092 0.62173301 0.60920157 0.65641902]

mean value: 0.6199395673145706

key: train_mcc
value: [0.7518026 0.76814463 0.72860483 0.75891598 0.78234745 0.77861045
 0.75483144 0.78239885 0.79733748 0.73804273]

mean value: 0.7641036458050515

key: test_accuracy
value: [0.82954545 0.79545455 0.76136364 0.76136364 0.85057471 0.82758621
 0.8045977 0.8045977 0.8045977 0.82758621]

mean value: 0.806726750261233

key: train_accuracy
value: [0.8740458 0.88295165 0.86259542 0.8778626 0.88945362 0.88818297
 0.87547649 0.88945362 0.89707751 0.86658196]

mean value: 0.8803681646087341

key: test_fscore
value: [0.84536082 0.80851064 0.76923077 0.76923077 0.86021505 0.83516484
 0.82105263 0.82474227 0.80898876 0.83516484]

mean value: 0.8177661389259918

key: train_fscore
value: [0.88 0.8872549 0.86893204 0.88321168 0.89454545 0.89242054
 0.88164251 0.89428919 0.90133983 0.87364621]

mean value: 0.8857282348915667

key: test_precision
value: [0.77358491 0.76 0.74468085 0.74468085 0.8 0.79166667
 0.75 0.75471698 0.8 0.80851064]

mean value: 0.7727840893884651

key: train_precision
value: [0.84027778 0.85579196 0.83062645 0.84615385 0.85614849 0.86084906
 0.84101382 0.85581395 0.86448598 0.82876712]

mean value: 0.8479928467674945

key: test_recall
value: [0.93181818 0.86363636 0.79545455 0.79545455 0.93023256 0.88372093
 0.90697674 0.90909091 0.81818182 0.86363636]

mean value: 0.8698202959830866

key: train_recall
value: [0.92366412 0.92111959 0.91094148 0.92366412 0.93654822 0.92639594
 0.92639594 0.93638677 0.94147583 0.92366412]

mean value: 0.9270256132057194

key: test_roc_auc
value: [0.82954545 0.79545455 0.76136364 0.76136364 0.85147992 0.8282241
 0.8057611 0.80338266 0.80443975 0.82716702]

mean value: 0.8068181818181818

key: train_roc_auc
value: [0.8740458 0.88295165 0.86259542 0.8778626 0.8893937 0.88813436
 0.87541171 0.88951318 0.89713385 0.8666544 ]

mean value: 0.8803696671445732

key: test_jcc
value: [0.73214286 0.67857143 0.625 0.625 0.75471698 0.71698113
 0.69642857 0.70175439 0.67924528 0.71698113]

mean value: 0.6926821771409656

key: train_jcc
value: [0.78571429 0.79735683 0.76824034 0.79084967 0.80921053 0.80573951
 0.78833693 0.80879121 0.82039911 0.77564103]

mean value: 0.7950279451682578

MCC on Blind test: 0.47

Accuracy on Blind test: 0.74

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04097724 0.08146095 0.0438447 0.04152274 0.04924703 0.0503757
|
|
0.0411675 0.04127216 0.04899478 0.04264069]
|
|
|
|
mean value: 0.04815034866333008
|
|
|
|
key: score_time
|
|
value: [0.0130403 0.02584267 0.01314807 0.0123024 0.01325655 0.0131979
|
|
0.01327658 0.01870227 0.01221871 0.01481318]
|
|
|
|
mean value: 0.014979863166809082
|
|
|
|
key: test_mcc
|
|
value: [0.70014004 0.54772256 0.36363636 0.52394654 0.70254862 0.49497627
|
|
0.52973328 0.52312769 0.58615222 0.60920157]
|
|
|
|
mean value: 0.5581185160759019
|
|
|
|
key: train_mcc
|
|
value: [0.68354893 0.665364 0.68555338 0.68310469 0.64825655 0.66820753
|
|
0.65258177 0.66788902 0.66561315 0.66056435]
|
|
|
|
mean value: 0.6680683364521826
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.77272727 0.68181818 0.76136364 0.85057471 0.74712644
|
|
0.75862069 0.75862069 0.79310345 0.8045977 ]
|
|
|
|
mean value: 0.7769461859979101
|
|
|
|
key: train_accuracy
|
|
value: [0.84096692 0.83206107 0.84223919 0.84096692 0.82337992 0.83354511
|
|
0.82592122 0.83354511 0.83227446 0.82846252]
|
|
|
|
mean value: 0.8333362432143192
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.7826087 0.68181818 0.76923077 0.84337349 0.75
|
|
0.77894737 0.77894737 0.79545455 0.80898876]
|
|
|
|
mean value: 0.784651204416148
|
|
|
|
key: train_fscore
|
|
value: [0.84624846 0.83703704 0.84653465 0.84548826 0.82944785 0.83847102
|
|
0.83023544 0.83726708 0.83663366 0.83675937]
|
|
|
|
mean value: 0.8384122841516979
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.75 0.68181818 0.74468085 0.875 0.73333333
|
|
0.71153846 0.7254902 0.79545455 0.8 ]
|
|
|
|
mean value: 0.7595093347064561
|
|
|
|
key: train_precision
|
|
value: [0.81904762 0.81294964 0.82409639 0.82211538 0.80285036 0.81534772
|
|
0.81113801 0.81796117 0.81445783 0.79723502]
|
|
|
|
mean value: 0.8137199141553185
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.81818182 0.68181818 0.79545455 0.81395349 0.76744186
|
|
0.86046512 0.84090909 0.79545455 0.81818182]
|
|
|
|
mean value: 0.8146405919661733
|
|
|
|
key: train_recall
|
|
value: [0.87531807 0.86259542 0.87022901 0.87022901 0.85786802 0.86294416
|
|
0.85025381 0.85750636 0.86005089 0.88040712]
|
|
|
|
mean value: 0.8647401867710311
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.77272727 0.68181818 0.76136364 0.85015856 0.74735729
|
|
0.75977801 0.75766385 0.79307611 0.80443975]
|
|
|
|
mean value: 0.7769291754756871
|
|
|
|
key: train_roc_auc
|
|
value: [0.84096692 0.83206107 0.84223919 0.84096692 0.82333605 0.8335077
|
|
0.82589026 0.83357552 0.83230971 0.82852844]
|
|
|
|
mean value: 0.8333381769804058
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.64285714 0.51724138 0.625 0.72916667 0.6
|
|
0.63793103 0.63793103 0.66037736 0.67924528]
|
|
|
|
mean value: 0.6479749899309105
|
|
|
|
key: train_jcc
|
|
value: [0.73347548 0.71974522 0.73390558 0.73233405 0.70859539 0.72186837
|
|
0.70974576 0.72008547 0.71914894 0.71933472]
|
|
|
|
mean value: 0.7218238970505827
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.95160985 1.14174104 0.97420168 1.11319375 0.93467951 1.07230449
|
|
0.9751153 0.97105861 0.97270489 0.97742367]
|
|
|
|
mean value: 1.0084032773971559
|
|
|
|
key: score_time
|
|
value: [0.01518869 0.01511359 0.0150919 0.01260114 0.01492715 0.01479912
|
|
0.01499319 0.01532936 0.01512575 0.01502037]
|
|
|
|
mean value: 0.014819025993347168
|
|
|
|
key: test_mcc
|
|
value: [0.64236405 0.59152048 0.45501576 0.59152048 0.65696218 0.65696218
|
|
0.80389885 0.64236223 0.60940803 0.5641598 ]
|
|
|
|
mean value: 0.6214174050499405
|
|
|
|
key: train_mcc
|
|
value: [0.77708835 0.79694349 0.77893113 0.80712807 0.82524165 0.77226648
|
|
0.76473299 0.82318429 0.79021233 0.81771201]
|
|
|
|
mean value: 0.7953440786857605
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.79545455 0.72727273 0.79545455 0.82758621 0.82758621
|
|
0.89655172 0.81609195 0.8045977 0.7816092 ]
|
|
|
|
mean value: 0.8090386624869383
|
|
|
|
key: train_accuracy
|
|
value: [0.88804071 0.89821883 0.88931298 0.90330789 0.91232529 0.88564168
|
|
0.88182973 0.91105464 0.89453621 0.90851334]
|
|
|
|
mean value: 0.8972781296578303
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
key: test_fscore
|
|
value: [0.82978723 0.8 0.72093023 0.8 0.83146067 0.83146067
|
|
0.90322581 0.83333333 0.8045977 0.79120879]
|
|
|
|
mean value: 0.8146004447058462
|
|
|
|
key: train_fscore
|
|
value: [0.89081886 0.9 0.89084065 0.905 0.91407223 0.88861386
|
|
0.88504326 0.91315136 0.89714994 0.91022444]
|
|
|
|
mean value: 0.8994914606531482
|
|
|
|
key: test_precision
|
|
value: [0.78 0.7826087 0.73809524 0.7826087 0.80434783 0.80434783
|
|
0.84 0.76923077 0.81395349 0.76595745]
|
|
|
|
mean value: 0.7881149985984872
|
|
|
|
key: train_precision
|
|
value: [0.86924939 0.88452088 0.87871287 0.88943489 0.89731051 0.86714976
|
|
0.8626506 0.89104116 0.87439614 0.89242054]
|
|
|
|
mean value: 0.8806886749617817
|
|
|
|
key: test_recall
|
|
value: [0.88636364 0.81818182 0.70454545 0.81818182 0.86046512 0.86046512
|
|
0.97674419 0.90909091 0.79545455 0.81818182]
|
|
|
|
mean value: 0.8447674418604652
|
|
|
|
key: train_recall
|
|
value: [0.91348601 0.91603053 0.90330789 0.92111959 0.93147208 0.91116751
|
|
0.90862944 0.93638677 0.92111959 0.92875318]
|
|
|
|
mean value: 0.9191472597873962
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.79545455 0.72727273 0.79545455 0.82795983 0.82795983
|
|
0.897463 0.81501057 0.80470402 0.78118393]
|
|
|
|
mean value: 0.8090644820295982
|
|
|
|
key: train_roc_auc
|
|
value: [0.88804071 0.89821883 0.88931298 0.90330789 0.91230093 0.8856092
|
|
0.88179564 0.91108679 0.89456995 0.90853903]
|
|
|
|
mean value: 0.89727819325506
|
|
|
|
key: test_jcc
|
|
value: [0.70909091 0.66666667 0.56363636 0.66666667 0.71153846 0.71153846
|
|
0.82352941 0.71428571 0.67307692 0.65454545]
|
|
|
|
mean value: 0.6894575032810327
|
|
|
|
key: train_jcc
|
|
value: [0.80313199 0.81818182 0.80316742 0.82648402 0.84174312 0.79955457
|
|
0.79379157 0.84018265 0.81348315 0.83524027]
|
|
|
|
mean value: 0.817496057662837
|
|
|
|
MCC on Blind test: 0.4
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01658487 0.01413941 0.01192164 0.01237583 0.01122928 0.01246905
|
|
0.01264453 0.01317739 0.01210928 0.01178169]
|
|
|
|
mean value: 0.01284329891204834
|
|
|
|
key: score_time
|
|
value: [0.01262975 0.00963998 0.00926065 0.00925183 0.00913525 0.00937343
|
|
0.00943851 0.00968313 0.00934291 0.00939965]
|
|
|
|
mean value: 0.009715509414672852
|
|
|
|
key: test_mcc
|
|
value: [0.32756921 0.4328254 0.32357511 0.56950711 0.44952813 0.49497627
|
|
0.40221987 0.19637409 0.33634906 0.28752643]
|
|
|
|
mean value: 0.38204506945222433
|
|
|
|
key: train_mcc
|
|
value: [0.38790491 0.46853479 0.45357422 0.43567904 0.46977459 0.46977459
|
|
0.43495033 0.44177962 0.4034274 0.47193424]
|
|
|
|
mean value: 0.4437333727941941
|
|
|
|
key: test_accuracy
|
|
value: [0.64772727 0.71590909 0.65909091 0.78409091 0.72413793 0.74712644
|
|
0.70114943 0.59770115 0.66666667 0.64367816]
|
|
|
|
mean value: 0.6887277951933124
|
|
|
|
key: train_accuracy
|
|
value: [0.68320611 0.73409669 0.7264631 0.71755725 0.73443456 0.73443456
|
|
0.71664549 0.72045743 0.70012706 0.73570521]
|
|
|
|
mean value: 0.7203127475419588
|
|
|
|
key: test_fscore
|
|
value: [0.71028037 0.72527473 0.6875 0.79120879 0.70731707 0.75
|
|
0.69767442 0.63157895 0.65060241 0.64367816]
|
|
|
|
mean value: 0.6995114900017191
|
|
|
|
key: train_fscore
|
|
value: [0.72786885 0.73907615 0.73358116 0.72456576 0.74292743 0.74292743
|
|
0.7290401 0.72839506 0.67934783 0.74129353]
|
|
|
|
mean value: 0.7289023304804851
|
|
|
|
key: test_precision
|
|
value: [0.6031746 0.70212766 0.63461538 0.76595745 0.74358974 0.73333333
|
|
0.69767442 0.58823529 0.69230769 0.65116279]
|
|
|
|
mean value: 0.6812178366823708
|
|
|
|
key: train_precision
|
|
value: [0.63793103 0.7254902 0.71497585 0.70702179 0.72076372 0.72076372
|
|
0.6993007 0.70743405 0.72886297 0.72506083]
|
|
|
|
mean value: 0.7087604867110122
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.75 0.75 0.81818182 0.6744186 0.76744186
|
|
0.69767442 0.68181818 0.61363636 0.63636364]
|
|
|
|
mean value: 0.7253171247357294
|
|
|
|
key: train_recall
|
|
value: [0.84732824 0.75318066 0.75318066 0.74300254 0.76649746 0.76649746
|
|
0.76142132 0.75063613 0.63613232 0.75826972]
|
|
|
|
mean value: 0.7536146523553041
|
|
|
|
key: test_roc_auc
|
|
value: [0.64772727 0.71590909 0.65909091 0.78409091 0.72357294 0.74735729
|
|
0.70110994 0.59672304 0.6672833 0.64376321]
|
|
|
|
mean value: 0.6886627906976744
|
|
|
|
key: train_roc_auc
|
|
value: [0.68320611 0.73409669 0.7264631 0.71755725 0.73439377 0.73439377
|
|
0.71658852 0.72049573 0.70004585 0.73573384]
|
|
|
|
mean value: 0.7202974645122124
|
|
|
|
key: test_jcc
|
|
value: [0.55072464 0.56896552 0.52380952 0.65454545 0.54716981 0.6
|
|
0.53571429 0.46153846 0.48214286 0.47457627]
|
|
|
|
mean value: 0.5399186820180317
|
|
|
|
key: train_jcc
|
|
value: [0.57216495 0.58613861 0.57925636 0.56809339 0.59099804 0.59099804
|
|
0.57361377 0.57281553 0.51440329 0.58893281]
|
|
|
|
mean value: 0.573741479292912
|
|
|
|
MCC on Blind test: 0.54
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01288152 0.01601839 0.01601601 0.01600742 0.01615286 0.01606011
|
|
0.01612782 0.01603866 0.01612043 0.01610899]
|
|
|
|
mean value: 0.01575322151184082
|
|
|
|
key: score_time
|
|
value: [0.01213813 0.01230121 0.01232433 0.01243854 0.01229501 0.01237464
|
|
0.01247144 0.01235986 0.0123229 0.01231599]
|
|
|
|
mean value: 0.012334203720092774
|
|
|
|
key: test_mcc
|
|
value: [0.50051733 0.36706517 0.22941573 0.52613536 0.51718675 0.38062515
|
|
0.40330006 0.33351176 0.58699109 0.24125255]
|
|
|
|
mean value: 0.4086000957908537
|
|
|
|
key: train_mcc
|
|
value: [0.44906143 0.4633579 0.46353821 0.49784849 0.45973957 0.47436493
|
|
0.44375086 0.48223144 0.45349856 0.49303545]
|
|
|
|
mean value: 0.46804268263856164
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.68181818 0.61363636 0.76136364 0.75862069 0.68965517
|
|
0.70114943 0.66666667 0.79310345 0.62068966]
|
|
|
|
mean value: 0.7036703239289446
|
|
|
|
key: train_accuracy
|
|
value: [0.72391858 0.73155216 0.73155216 0.74681934 0.72935197 0.73697586
|
|
0.72172808 0.7407878 0.72554003 0.74587039]
|
|
|
|
mean value: 0.7334096368791849
|
|
|
|
key: test_fscore
|
|
value: [0.75555556 0.70212766 0.63829787 0.77419355 0.75294118 0.69662921
|
|
0.68292683 0.68131868 0.79069767 0.63736264]
|
|
|
|
mean value: 0.7112050848179496
|
|
|
|
key: train_fscore
|
|
value: [0.73374233 0.7359199 0.73723537 0.76224612 0.73865031 0.74285714
|
|
0.72727273 0.74689826 0.73849879 0.75429975]
|
|
|
|
mean value: 0.7417620699172001
|
|
|
|
key: test_precision
|
|
value: [0.73913043 0.66 0.6 0.73469388 0.76190476 0.67391304
|
|
0.71794872 0.65957447 0.80952381 0.61702128]
|
|
|
|
mean value: 0.697371038987003
|
|
|
|
key: train_precision
|
|
value: [0.70853081 0.72413793 0.72195122 0.71846847 0.71496437 0.72749392
|
|
0.71393643 0.72881356 0.70438799 0.72921615]
|
|
|
|
mean value: 0.7191900844944616
|
|
|
|
key: test_recall
|
|
value: [0.77272727 0.75 0.68181818 0.81818182 0.74418605 0.72093023
|
|
0.65116279 0.70454545 0.77272727 0.65909091]
|
|
|
|
mean value: 0.727536997885835
|
|
|
|
key: train_recall
|
|
value: [0.76081425 0.7480916 0.75318066 0.81170483 0.76395939 0.75888325
|
|
0.74111675 0.76590331 0.77608142 0.78117048]
|
|
|
|
mean value: 0.766090595574844
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.68181818 0.61363636 0.76136364 0.75845666 0.69001057
|
|
0.7005814 0.66622622 0.79334038 0.62024313]
|
|
|
|
mean value: 0.7035676532769556
|
|
|
|
key: train_roc_auc
|
|
value: [0.72391858 0.73155216 0.73155216 0.74681934 0.72930794 0.73694799
|
|
0.72170341 0.74081967 0.72560416 0.74591519]
|
|
|
|
mean value: 0.7334140607845416
|
|
|
|
key: test_jcc
|
|
value: [0.60714286 0.54098361 0.46875 0.63157895 0.60377358 0.53448276
|
|
0.51851852 0.51666667 0.65384615 0.46774194]
|
|
|
|
mean value: 0.5543485029110216
|
|
|
|
key: train_jcc
|
|
value: [0.57945736 0.58217822 0.58382643 0.61583012 0.58560311 0.59090909
|
|
0.57142857 0.5960396 0.58541267 0.60552268]
|
|
|
|
mean value: 0.5896207857503801
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01600981 0.01106691 0.010391 0.01046634 0.01080656 0.01074886
|
|
0.01149368 0.01192141 0.01154351 0.01168633]
|
|
|
|
mean value: 0.01161344051361084
|
|
|
|
key: score_time
|
|
value: [0.0355916 0.01523995 0.01347685 0.01362085 0.01415038 0.01872969
|
|
0.0192039 0.01468134 0.01902127 0.01453424]
|
|
|
|
mean value: 0.017825007438659668
|
|
|
|
key: test_mcc
|
|
value: [0.38646346 0.50051733 0.18353259 0.52286233 0.49497627 0.36069346
|
|
0.37916452 0.31021744 0.42577098 0.33458714]
|
|
|
|
mean value: 0.38987855000800115
|
|
|
|
key: train_mcc
|
|
value: [0.58032736 0.60193114 0.55910844 0.57224844 0.59486336 0.60956553
|
|
0.62834476 0.57455631 0.59101716 0.5941861 ]
|
|
|
|
mean value: 0.5906148615612009
|
|
|
|
key: test_accuracy
|
|
value: [0.69318182 0.75 0.59090909 0.76136364 0.74712644 0.67816092
|
|
0.68965517 0.65517241 0.71264368 0.66666667]
|
|
|
|
mean value: 0.6944879832810867
|
|
|
|
key: train_accuracy
|
|
value: [0.78880407 0.80025445 0.77862595 0.78498728 0.79669632 0.8043202
|
|
0.81321474 0.78653113 0.79542567 0.79669632]
|
|
|
|
mean value: 0.7945556126754416
|
|
|
|
key: test_fscore
|
|
value: [0.69662921 0.75555556 0.56097561 0.76404494 0.75 0.69565217
|
|
0.68235294 0.66666667 0.72527473 0.68817204]
|
|
|
|
mean value: 0.6985323872656682
|
|
|
|
key: train_fscore
|
|
value: [0.79854369 0.80688807 0.78728606 0.79415347 0.80392157 0.80987654
|
|
0.82051282 0.79361179 0.79748428 0.80148883]
|
|
|
|
mean value: 0.801376712958553
|
|
|
|
key: test_precision
|
|
value: [0.68888889 0.73913043 0.60526316 0.75555556 0.73333333 0.65306122
|
|
0.69047619 0.65217391 0.70212766 0.65306122]
|
|
|
|
mean value: 0.6873071582528851
|
|
|
|
key: train_precision
|
|
value: [0.76334107 0.78095238 0.75764706 0.76168224 0.77725118 0.78846154
|
|
0.79058824 0.7672209 0.78855721 0.78208232]
|
|
|
|
mean value: 0.7757784149640108
|
|
|
|
key: test_recall
|
|
value: [0.70454545 0.77272727 0.52272727 0.77272727 0.76744186 0.74418605
|
|
0.6744186 0.68181818 0.75 0.72727273]
|
|
|
|
mean value: 0.7117864693446089
|
|
|
|
key: train_recall
|
|
value: [0.83715013 0.8346056 0.81933842 0.82951654 0.83248731 0.83248731
|
|
0.85279188 0.82188295 0.80661578 0.82188295]
|
|
|
|
mean value: 0.8288758863874143
|
|
|
|
key: test_roc_auc
|
|
value: [0.69318182 0.75 0.59090909 0.76136364 0.74735729 0.67891121
|
|
0.68948203 0.65486258 0.7122093 0.66596195]
|
|
|
|
mean value: 0.694423890063425
|
|
|
|
key: train_roc_auc
|
|
value: [0.78880407 0.80025445 0.77862595 0.78498728 0.79665078 0.80428437
|
|
0.81316439 0.78657599 0.79543987 0.79672828]
|
|
|
|
mean value: 0.7945515428630475
|
|
|
|
key: test_jcc
|
|
value: [0.53448276 0.60714286 0.38983051 0.61818182 0.6 0.53333333
|
|
0.51785714 0.5 0.56896552 0.52459016]
|
|
|
|
mean value: 0.5394384099786222
|
|
|
|
key: train_jcc
|
|
value: [0.66464646 0.67628866 0.64919355 0.65858586 0.67213115 0.68049793
|
|
0.69565217 0.65784114 0.66317992 0.66873706]
|
|
|
|
mean value: 0.6686753895067395
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.05384135 0.05580044 0.0459094 0.0482986 0.04752588 0.04734731
0.0462234 0.04590893 0.04742646 0.04682469]

mean value: 0.04851064682006836

key: score_time
value: [0.01803613 0.01803875 0.01764989 0.01866722 0.01710939 0.01780176
0.01760387 0.01767254 0.01786971 0.0177207 ]

mean value: 0.017816996574401854

key: test_mcc
value: [0.62330229 0.50471461 0.45643546 0.57551157 0.65520898 0.54610162
0.53589621 0.48553084 0.59547841 0.50171077]

mean value: 0.5479890752759377

key: train_mcc
value: [0.65775744 0.67329502 0.64908799 0.64041864 0.62129367 0.65514187
0.64946023 0.6555617 0.65473654 0.64687618]

mean value: 0.6503629283381792

key: test_accuracy
value: [0.79545455 0.75 0.72727273 0.78409091 0.82758621 0.77011494
0.75862069 0.73563218 0.79310345 0.74712644]

mean value: 0.7689002089864159

key: train_accuracy
value: [0.82315522 0.8307888 0.82061069 0.81552163 0.80559085 0.82337992
0.82083863 0.82465057 0.82210928 0.81575604]

mean value: 0.820240162177367

key: test_fscore
value: [0.82352941 0.76595745 0.73913043 0.8 0.82352941 0.7826087
0.78350515 0.76767677 0.8125 0.77083333]

mean value: 0.7869270656421982

key: train_fscore
value: [0.83818393 0.8451688 0.83353011 0.83001172 0.82188591 0.83666275
0.83392226 0.83571429 0.8364486 0.83352468]

mean value: 0.8345053058485761

key: test_precision
value: [0.72413793 0.72 0.70833333 0.74509804 0.83333333 0.73469388
0.7037037 0.69090909 0.75 0.71153846]

mean value: 0.7321747770619113

key: train_precision
value: [0.77253219 0.77896996 0.77753304 0.76956522 0.75913978 0.77899344
0.77802198 0.7852349 0.77321814 0.75941423]

mean value: 0.7732622869197299

key: test_recall
value: [0.95454545 0.81818182 0.77272727 0.86363636 0.81395349 0.8372093
0.88372093 0.86363636 0.88636364 0.84090909]

mean value: 0.8534883720930233

key: train_recall
value: [0.91603053 0.92366412 0.89821883 0.90076336 0.89593909 0.9035533
0.89847716 0.89312977 0.91094148 0.92366412]

mean value: 0.9064381756887666

key: test_roc_auc
value: [0.79545455 0.75 0.72727273 0.78409091 0.82743129 0.77087738
0.76004228 0.73414376 0.79201903 0.74603594]

mean value: 0.7687367864693446

key: train_roc_auc
value: [0.82315522 0.8307888 0.82061069 0.81552163 0.8054759 0.82327792
0.82073985 0.82473747 0.82222201 0.81589297]

mean value: 0.820242246935586

key: test_jcc
value: [0.7 0.62068966 0.5862069 0.66666667 0.7 0.64285714
0.6440678 0.62295082 0.68421053 0.62711864]

mean value: 0.6494768147913834

key: train_jcc
value: [0.72144289 0.73185484 0.7145749 0.70941884 0.69762846 0.71919192
0.71515152 0.71779141 0.7188755 0.71456693]

mean value: 0.716049719596829

MCC on Blind test: 0.37

Accuracy on Blind test: 0.69

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [1.45261788 2.10335112 2.15535569 1.10358024 2.93123484 2.78560662
1.2532444 2.88533401 2.78543663 2.89050388]

mean value: 2.234626531600952

key: score_time
value: [0.01254225 0.01251721 0.01264668 0.01255536 0.01265812 0.01263404
0.01261973 0.0126524 0.01485896 0.01480293]

mean value: 0.013048768043518066

key: test_mcc
value: [0.62689067 0.61506768 0.48342972 0.62155249 0.67866682 0.64384947
0.61102358 0.63444041 0.61269937 0.58615222]

mean value: 0.6113772430907065

key: train_mcc
value: [0.86784408 0.90330789 0.89428856 0.77424761 0.92069685 0.91319934
0.79345585 0.9496176 0.91830829 0.94977755]

mean value: 0.8884743609283701

key: test_accuracy
value: [0.80681818 0.80681818 0.73863636 0.80681818 0.83908046 0.81609195
0.79310345 0.81609195 0.8045977 0.79310345]

mean value: 0.802115987460815

key: train_accuracy
value: [0.93256997 0.95165394 0.94656489 0.88549618 0.95933926 0.95552732
0.89326557 0.97458704 0.95806861 0.97458704]

mean value: 0.9431659828446349

key: test_fscore
value: [0.82474227 0.81318681 0.71604938 0.82105263 0.83333333 0.82978723
0.81632653 0.82608696 0.8172043 0.79545455]

mean value: 0.8093223996562732

key: train_fscore
value: [0.93512852 0.95165394 0.94516971 0.89051095 0.95800525 0.95705521
0.9 0.97493734 0.95940959 0.975 ]

mean value: 0.9446870526213144

key: test_precision
value: [0.75471698 0.78723404 0.78378378 0.76470588 0.85365854 0.76470588
0.72727273 0.79166667 0.7755102 0.79545455]

mean value: 0.7798709252235871

key: train_precision
value: [0.9009434 0.95165394 0.97050938 0.85314685 0.99184783 0.9263658
0.84753363 0.96049383 0.92857143 0.95823096]

mean value: 0.9289297044832938

key: test_recall
value: [0.90909091 0.84090909 0.65909091 0.88636364 0.81395349 0.90697674
0.93023256 0.86363636 0.86363636 0.79545455]

mean value: 0.8469344608879492

key: train_recall
value: [0.97201018 0.95165394 0.92111959 0.93129771 0.92639594 0.98984772
0.95939086 0.98982188 0.99236641 0.99236641]

mean value: 0.9626270650082019

key: test_roc_auc
value: [0.80681818 0.80681818 0.73863636 0.80681818 0.83879493 0.81712474
0.79466173 0.81553911 0.80391121 0.79307611]

mean value: 0.8022198731501057

key: train_roc_auc
value: [0.93256997 0.95165394 0.94656489 0.88549618 0.95938118 0.95548365
0.89318144 0.97460637 0.95811214 0.9746096 ]

mean value: 0.9431659368905078

key: test_jcc
value: [0.70175439 0.68518519 0.55769231 0.69642857 0.71428571 0.70909091
0.68965517 0.7037037 0.69090909 0.66037736]

mean value: 0.6809082399164754

key: train_jcc
value: [0.87816092 0.90776699 0.8960396 0.80263158 0.91939547 0.91764706
0.81818182 0.95110024 0.92198582 0.95121951]

mean value: 0.8964129008036302

MCC on Blind test: 0.32

Accuracy on Blind test: 0.68

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.0604918 0.04539132 0.05021358 0.04280376 0.04391217 0.04870415
0.05093575 0.04413128 0.04482388 0.04480577]

mean value: 0.04762134552001953

key: score_time
value: [0.00954056 0.00907254 0.00901723 0.00905633 0.00914407 0.00906348
0.00919557 0.00921273 0.00918293 0.00919628]

mean value: 0.009168171882629394

key: test_mcc
value: [0.84287052 0.75174939 0.64236405 0.81902836 0.72746922 0.81606765
0.74735729 0.7472238 0.65539112 0.65905141]

mean value: 0.7408572818519341

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.92045455 0.875 0.81818182 0.90909091 0.86206897 0.90804598
0.87356322 0.87356322 0.82758621 0.82758621]

mean value: 0.869514106583072

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.92307692 0.87912088 0.82978723 0.91111111 0.86666667 0.90697674
0.87356322 0.87640449 0.82758621 0.83870968]

mean value: 0.8733003155292913

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.89361702 0.85106383 0.78 0.89130435 0.82978723 0.90697674
0.86363636 0.86666667 0.8372093 0.79591837]

mean value: 0.8516179877094067

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.95454545 0.90909091 0.88636364 0.93181818 0.90697674 0.90697674
0.88372093 0.88636364 0.81818182 0.88636364]

mean value: 0.8970401691331924

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.92045455 0.875 0.81818182 0.90909091 0.86257928 0.90803383
0.87367865 0.87341438 0.82769556 0.82690275]

mean value: 0.8695031712473573

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.85714286 0.78431373 0.70909091 0.83673469 0.76470588 0.82978723
0.7755102 0.78 0.70588235 0.72222222]

mean value: 0.7765390081242038

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.67

Accuracy on Blind test: 0.84

Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.16874266 0.17347503 0.1705687 0.17209721 0.17211318 0.17244864
0.17359471 0.17075062 0.17084527 0.17014956]

mean value: 0.1714785575866699

key: score_time
value: [0.01893568 0.02032566 0.01877761 0.01896501 0.02001143 0.01879668
0.01911426 0.01880836 0.01880097 0.0191679 ]

mean value: 0.019170355796813966

key: test_mcc
value: [0.77352678 0.54601891 0.65926119 0.63702206 0.67900591 0.68515773
0.54295079 0.52312769 0.70301836 0.63213531]

mean value: 0.6381224729589329

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.88636364 0.77272727 0.82954545 0.81818182 0.83908046 0.83908046
0.77011494 0.75862069 0.85057471 0.81609195]

mean value: 0.8180381400208986

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.88888889 0.77777778 0.82758621 0.82222222 0.84090909 0.84782609
0.77777778 0.77894737 0.84705882 0.81818182]

mean value: 0.8227176061561114

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.86956522 0.76086957 0.8372093 0.80434783 0.82222222 0.79591837
0.74468085 0.7254902 0.87804878 0.81818182]

mean value: 0.8056534146402279

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.90909091 0.79545455 0.81818182 0.84090909 0.86046512 0.90697674
0.81395349 0.84090909 0.81818182 0.81818182]

mean value: 0.84223044397463

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.88636364 0.77272727 0.82954545 0.81818182 0.83932347 0.83985201
0.77061311 0.75766385 0.85095137 0.81606765]

mean value: 0.8181289640591967

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.8 0.63636364 0.70588235 0.69811321 0.7254902 0.73584906
0.63636364 0.63793103 0.73469388 0.69230769]

mean value: 0.7002994690239295

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.43

Accuracy on Blind test: 0.73

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.01320601 0.01333499 0.01326704 0.01336837 0.0133636 0.01205587
0.01195025 0.01203203 0.01198006 0.01238704]

mean value: 0.012694525718688964

key: score_time
value: [0.0099504 0.00991678 0.00989676 0.00985718 0.0095439 0.00915027
0.00912404 0.00921392 0.00910091 0.00892544]

mean value: 0.0094679594039917

key: test_mcc
value: [0.50471461 0.43738879 0.29553088 0.61379491 0.3853797 0.5504913
0.33456898 0.42976952 0.40221987 0.4957562 ]

mean value: 0.44496147605893255

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.75 0.71590909 0.64772727 0.80681818 0.68965517 0.77011494
0.66666667 0.71264368 0.70114943 0.74712644]

mean value: 0.7207810867293626

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.76595745 0.73684211 0.65168539 0.8045977 0.70967742 0.78723404
0.6741573 0.73684211 0.70454545 0.76086957]

mean value: 0.7332408536784342

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.72 0.68627451 0.64444444 0.81395349 0.66 0.7254902
0.65217391 0.68627451 0.70454545 0.72916667]

mean value: 0.7022323182758412

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.81818182 0.79545455 0.65909091 0.79545455 0.76744186 0.86046512
0.69767442 0.79545455 0.70454545 0.79545455]

mean value: 0.7689217758985201

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.75 0.71590909 0.64772727 0.80681818 0.69053911 0.77114165
0.66701903 0.71168076 0.70110994 0.74656448]

mean value: 0.7208509513742072

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.62068966 0.58333333 0.48333333 0.67307692 0.55 0.64912281
0.50847458 0.58333333 0.54385965 0.61403509]

mean value: 0.5809258698380173

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.36

Accuracy on Blind test: 0.69

Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.75696111 2.67333555 2.62944531 2.70618653 2.58711076 2.66426349
|
|
2.66188812 2.57534146 2.5954206 2.59278345]
|
|
|
|
mean value: 2.6442736387252808
|
|
|
|
key: score_time
|
|
value: [0.10721159 0.10074186 0.10092258 0.10331321 0.09919405 0.10507584
|
|
0.09974909 0.10483932 0.09813857 0.0980525 ]
|
|
|
|
mean value: 0.10172386169433593
|
|
|
|
key: test_mcc
|
|
value: [0.86452993 0.82589664 0.75019377 0.75174939 0.77786181 0.84118687
|
|
0.85040097 0.79810753 0.79334038 0.86289151]
|
|
|
|
mean value: 0.8116158816503178
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.90909091 0.875 0.875 0.88505747 0.91954023
|
|
0.91954023 0.89655172 0.89655172 0.93103448]
|
|
|
|
mean value: 0.9039184952978057
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.91489362 0.87640449 0.87912088 0.89130435 0.92134831
|
|
0.92473118 0.90322581 0.89655172 0.93333333]
|
|
|
|
mean value: 0.9074247033008916
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.91304348 0.86 0.86666667 0.85106383 0.83673469 0.89130435
|
|
0.86 0.85714286 0.90697674 0.91304348]
|
|
|
|
mean value: 0.8755976096008181
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.97727273 0.88636364 0.90909091 0.95348837 0.95348837
|
|
1. 0.95454545 0.88636364 0.95454545]
|
|
|
|
mean value: 0.942970401691332
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93181818 0.90909091 0.875 0.875 0.8858351 0.919926
|
|
0.92045455 0.89587738 0.89667019 0.9307611 ]
|
|
|
|
mean value: 0.9040433403805497
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.84313725 0.78 0.78431373 0.80392157 0.85416667
|
|
0.86 0.82352941 0.8125 0.875 ]
|
|
|
|
mean value: 0.831156862745098
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.64
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.1576159 1.13111639 1.09706926 1.06529927 1.07890487 1.10068297
|
|
1.08443117 1.10074687 1.09086037 1.1387794 ]
|
|
|
|
mean value: 1.1045506477355957
|
|
|
|
key: score_time
|
|
value: [0.27794409 0.15603256 0.28928804 0.25447321 0.27145886 0.28354359
|
|
0.27408671 0.28654885 0.28274965 0.26291871]
|
|
|
|
mean value: 0.26390442848205564
|
|
|
|
key: test_mcc
|
|
value: [0.88843109 0.8057162 0.72802521 0.81902836 0.70301836 0.77008457
|
|
0.84485784 0.74867823 0.81702814 0.86289151]
|
|
|
|
mean value: 0.798775949639167
|
|
|
|
key: train_mcc
|
|
value: [0.90651296 0.91391215 0.90099965 0.90125658 0.91171289 0.901238
|
|
0.90632277 0.92681126 0.90870336 0.9217261 ]
|
|
|
|
mean value: 0.9099195719638971
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.89772727 0.86363636 0.90909091 0.85057471 0.88505747
|
|
0.91954023 0.87356322 0.90804598 0.93103448]
|
|
|
|
mean value: 0.8981452455590386
|
|
|
|
key: train_accuracy
|
|
value: [0.95292621 0.956743 0.95038168 0.95038168 0.95552732 0.95044473
|
|
0.95298602 0.96315121 0.95425667 0.96060991]
|
|
|
|
mean value: 0.9547408427661975
|
|
|
|
key: test_fscore
|
|
value: [0.94505495 0.90526316 0.86666667 0.91111111 0.85393258 0.88372093
|
|
0.92307692 0.87912088 0.90697674 0.93333333]
|
|
|
|
mean value: 0.9008257274946863
|
|
|
|
key: train_fscore
|
|
value: [0.95380774 0.95739348 0.9509434 0.95118899 0.95641345 0.95118899
|
|
0.95369212 0.96370463 0.95465995 0.9612015 ]
|
|
|
|
mean value: 0.9554194239721927
|
|
|
|
key: test_precision
|
|
value: [0.91489362 0.84313725 0.84782609 0.89130435 0.82608696 0.88372093
|
|
0.875 0.85106383 0.92857143 0.91304348]
|
|
|
|
mean value: 0.8774647930079675
|
|
|
|
key: train_precision
|
|
value: [0.93627451 0.94320988 0.94029851 0.93596059 0.93887531 0.9382716
|
|
0.94074074 0.94827586 0.94513716 0.94581281]
|
|
|
|
mean value: 0.9412856963303278
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.97727273 0.88636364 0.93181818 0.88372093 0.88372093
|
|
0.97674419 0.90909091 0.88636364 0.95454545]
|
|
|
|
mean value: 0.9266913319238901
|
|
|
|
key: train_recall
|
|
value: [0.97201018 0.97201018 0.96183206 0.96692112 0.97461929 0.96446701
|
|
0.96700508 0.97964377 0.96437659 0.97709924]
|
|
|
|
mean value: 0.9699984500329368
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.89772727 0.86363636 0.90909091 0.85095137 0.88504228
|
|
0.92019027 0.87315011 0.9082981 0.9307611 ]
|
|
|
|
mean value: 0.8982029598308667
|
|
|
|
key: train_roc_auc
|
|
value: [0.95292621 0.956743 0.95038168 0.95038168 0.95550303 0.95042689
|
|
0.95296819 0.96317214 0.95426951 0.96063084]
|
|
|
|
mean value: 0.954740315934953
|
|
|
|
key: test_jcc
|
|
value: [0.89583333 0.82692308 0.76470588 0.83673469 0.74509804 0.79166667
|
|
0.85714286 0.78431373 0.82978723 0.875 ]
|
|
|
|
mean value: 0.8207205509044861
|
|
|
|
key: train_jcc
|
|
value: [0.91169451 0.91826923 0.90647482 0.90692124 0.91646778 0.90692124
|
|
0.91148325 0.92995169 0.91325301 0.9253012 ]
|
|
|
|
mean value: 0.9146737985460048
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0279119 0.01599693 0.01595879 0.0159502 0.01609945 0.0159781
|
|
0.02555108 0.01599431 0.01797676 0.01640034]
|
|
|
|
mean value: 0.018381786346435548
|
|
|
|
key: score_time
|
|
value: [0.01233649 0.01231313 0.01254392 0.01234722 0.0123775 0.01232219
|
|
0.01229525 0.01237607 0.01264095 0.01244164]
|
|
|
|
mean value: 0.012399435043334961
|
|
|
|
key: test_mcc
|
|
value: [0.50051733 0.36706517 0.22941573 0.52613536 0.51718675 0.38062515
|
|
0.40330006 0.33351176 0.58699109 0.24125255]
|
|
|
|
mean value: 0.4086000957908537
|
|
|
|
key: train_mcc
|
|
value: [0.44906143 0.4633579 0.46353821 0.49784849 0.45973957 0.47436493
|
|
0.44375086 0.48223144 0.45349856 0.49303545]
|
|
|
|
mean value: 0.46804268263856164
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.68181818 0.61363636 0.76136364 0.75862069 0.68965517
|
|
0.70114943 0.66666667 0.79310345 0.62068966]
|
|
|
|
mean value: 0.7036703239289446
|
|
|
|
key: train_accuracy
|
|
value: [0.72391858 0.73155216 0.73155216 0.74681934 0.72935197 0.73697586
|
|
0.72172808 0.7407878 0.72554003 0.74587039]
|
|
|
|
mean value: 0.7334096368791849
|
|
|
|
key: test_fscore
|
|
value: [0.75555556 0.70212766 0.63829787 0.77419355 0.75294118 0.69662921
|
|
0.68292683 0.68131868 0.79069767 0.63736264]
|
|
|
|
mean value: 0.7112050848179496
|
|
|
|
key: train_fscore
|
|
value: [0.73374233 0.7359199 0.73723537 0.76224612 0.73865031 0.74285714
|
|
0.72727273 0.74689826 0.73849879 0.75429975]
|
|
|
|
mean value: 0.7417620699172001
|
|
|
|
key: test_precision
|
|
value: [0.73913043 0.66 0.6 0.73469388 0.76190476 0.67391304
|
|
0.71794872 0.65957447 0.80952381 0.61702128]
|
|
|
|
mean value: 0.697371038987003
|
|
|
|
key: train_precision
|
|
value: [0.70853081 0.72413793 0.72195122 0.71846847 0.71496437 0.72749392
|
|
0.71393643 0.72881356 0.70438799 0.72921615]
|
|
|
|
mean value: 0.7191900844944616
|
|
|
|
key: test_recall
|
|
value: [0.77272727 0.75 0.68181818 0.81818182 0.74418605 0.72093023
|
|
0.65116279 0.70454545 0.77272727 0.65909091]
|
|
|
|
mean value: 0.727536997885835
|
|
|
|
key: train_recall
|
|
value: [0.76081425 0.7480916 0.75318066 0.81170483 0.76395939 0.75888325
|
|
0.74111675 0.76590331 0.77608142 0.78117048]
|
|
|
|
mean value: 0.766090595574844
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.68181818 0.61363636 0.76136364 0.75845666 0.69001057
|
|
0.7005814 0.66622622 0.79334038 0.62024313]
|
|
|
|
mean value: 0.7035676532769556
|
|
|
|
key: train_roc_auc
|
|
value: [0.72391858 0.73155216 0.73155216 0.74681934 0.72930794 0.73694799
|
|
0.72170341 0.74081967 0.72560416 0.74591519]
|
|
|
|
mean value: 0.7334140607845416
|
|
|
|
key: test_jcc
|
|
value: [0.60714286 0.54098361 0.46875 0.63157895 0.60377358 0.53448276
|
|
0.51851852 0.51666667 0.65384615 0.46774194]
|
|
|
|
mean value: 0.5543485029110216
|
|
|
|
key: train_jcc
|
|
value: [0.57945736 0.58217822 0.58382643 0.61583012 0.58560311 0.59090909
|
|
0.57142857 0.5960396 0.58541267 0.60552268]
|
|
|
|
mean value: 0.5896207857503801
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.16803455 0.12479401 0.13190556 0.13270855 0.13482141 0.12760592
|
|
0.12809372 0.13602591 0.12585688 0.12302065]
|
|
|
|
mean value: 0.133286714553833
|
|
|
|
key: score_time
|
|
value: [0.01130319 0.01135063 0.01241302 0.01135039 0.01175189 0.01135063
|
|
0.01132727 0.01134205 0.01131344 0.01137137]
|
|
|
|
mean value: 0.011487388610839843
|
|
|
|
key: test_mcc
|
|
value: [0.88659264 0.77594029 0.79566006 0.84287052 0.77008457 0.86205074
|
|
0.86585804 0.79323121 0.84118687 0.81683533]
|
|
|
|
mean value: 0.8250310278177166
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.88636364 0.89772727 0.92045455 0.88505747 0.93103448
|
|
0.93103448 0.89655172 0.91954023 0.90804598]
|
|
|
|
mean value: 0.9118991640543365
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94252874 0.89130435 0.89655172 0.92307692 0.88372093 0.93023256
|
|
0.93333333 0.8988764 0.91764706 0.91111111]
|
|
|
|
mean value: 0.9128383126807573
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95348837 0.85416667 0.90697674 0.89361702 0.88372093 0.93023256
|
|
0.89361702 0.88888889 0.95121951 0.89130435]
|
|
|
|
mean value: 0.9047232062781119
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.93181818 0.88636364 0.95454545 0.88372093 0.93023256
|
|
0.97674419 0.90909091 0.88636364 0.93181818]
|
|
|
|
mean value: 0.9222515856236786
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.88636364 0.89772727 0.92045455 0.88504228 0.93102537
|
|
0.93155391 0.89640592 0.919926 0.90776956]
|
|
|
|
mean value: 0.9119450317124735
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89130435 0.80392157 0.8125 0.85714286 0.79166667 0.86956522
|
|
0.875 0.81632653 0.84782609 0.83673469]
|
|
|
|
mean value: 0.8401987969100684
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04277778 0.06277061 0.09687138 0.06872678 0.07880187 0.06946063
|
|
0.08058167 0.06243658 0.07856321 0.05238271]
|
|
|
|
mean value: 0.06933732032775879
|
|
|
|
key: score_time
|
|
value: [0.01239014 0.01243758 0.02226973 0.01272941 0.01268435 0.01261806
|
|
0.01257753 0.01916575 0.01260209 0.01247287]
|
|
|
|
mean value: 0.014194750785827636
|
|
|
|
key: test_mcc
|
|
value: [0.64715023 0.59648091 0.38726484 0.52613536 0.65994555 0.61371748
|
|
0.70637613 0.66885041 0.51718675 0.54016913]
|
|
|
|
mean value: 0.586327679755505
|
|
|
|
key: train_mcc
|
|
value: [0.76304068 0.7600656 0.77117136 0.76814463 0.75793471 0.75767069
|
|
0.74972171 0.77835845 0.76606319 0.76083987]
|
|
|
|
mean value: 0.7633010876409456
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.79545455 0.69318182 0.76136364 0.82758621 0.8045977
|
|
0.83908046 0.82758621 0.75862069 0.77011494]
|
|
|
|
mean value: 0.789576802507837
|
|
|
|
key: train_accuracy
|
|
value: [0.88040712 0.87913486 0.88422392 0.88295165 0.87801779 0.87801779
|
|
0.8729352 0.88818297 0.88182973 0.87801779]
|
|
|
|
mean value: 0.8803718827899939
|
|
|
|
key: test_fscore
value: [0.83333333 0.80851064 0.7032967 0.77419355 0.83516484 0.81318681
0.85714286 0.84536082 0.76404494 0.77272727]

mean value: 0.8006961770099277

key: train_fscore
value: [0.88480392 0.88314883 0.88888889 0.8872549 0.88235294 0.88206388
0.87922705 0.89189189 0.88616891 0.88433735]

mean value: 0.8850138572225262

key: test_precision
value: [0.76923077 0.76 0.68085106 0.73469388 0.79166667 0.77083333
0.76363636 0.77358491 0.75555556 0.77272727]

mean value: 0.7572779808191146

key: train_precision
value: [0.8534279 0.8547619 0.85446009 0.85579196 0.85308057 0.8547619
0.83870968 0.86223278 0.85377358 0.83981693]

mean value: 0.8520817305357777

key: test_recall
value: [0.90909091 0.86363636 0.72727273 0.81818182 0.88372093 0.86046512
0.97674419 0.93181818 0.77272727 0.77272727]

mean value: 0.8516384778012684

key: train_recall
value: [0.91857506 0.91348601 0.92620865 0.92111959 0.91370558 0.91116751
0.92385787 0.92366412 0.92111959 0.93384224]

mean value: 0.9206746231642577

key: test_roc_auc
value: [0.81818182 0.79545455 0.69318182 0.76136364 0.8282241 0.80523256
0.84064482 0.82637421 0.75845666 0.77008457]

mean value: 0.7897198731501057

key: train_roc_auc
value: [0.88040712 0.87913486 0.88422392 0.88295165 0.87797238 0.87797561
0.87287041 0.888228 0.88187959 0.87808863]

mean value: 0.8803732191524264

key: test_jcc
value: [0.71428571 0.67857143 0.54237288 0.63157895 0.71698113 0.68518519
0.75 0.73214286 0.61818182 0.62962963]

mean value: 0.6698929593796458

key: train_jcc
value: [0.79340659 0.7907489 0.8 0.79735683 0.78947368 0.78901099
0.78448276 0.80487805 0.7956044 0.79265659]

mean value: 0.793761878397893

MCC on Blind test: 0.51

Accuracy on Blind test: 0.76

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01543641 0.01584482 0.01554322 0.01582074 0.01554036 0.01604772
0.01601362 0.01568246 0.01603746 0.0155468 ]

mean value: 0.015751361846923828

key: score_time
value: [0.01288176 0.01244569 0.01278973 0.01243854 0.01280117 0.01244664
0.01247144 0.0124588 0.0125134 0.01243162]

mean value: 0.012567877769470215

key: test_mcc
value: [0.51970115 0.4328254 0.38726484 0.60092521 0.5404983 0.49418605
0.35843235 0.28973226 0.51803019 0.33641135]

mean value: 0.4478007105008516

key: train_mcc
value: [0.4672002 0.45784843 0.47741223 0.48485612 0.45579637 0.4606799
0.47622996 0.47323703 0.46038218 0.47323703]

mean value: 0.46868794567069183

key: test_accuracy
value: [0.75 0.71590909 0.69318182 0.79545455 0.77011494 0.74712644
0.67816092 0.64367816 0.75862069 0.66666667]

mean value: 0.7218913270637408

key: train_accuracy
value: [0.73282443 0.72773537 0.73791349 0.74045802 0.72681067 0.72935197
0.73697586 0.73570521 0.72935197 0.73570521]

mean value: 0.7332832187163545

key: test_fscore
value: [0.78 0.72527473 0.7032967 0.8125 0.76190476 0.74418605
0.68888889 0.67368421 0.76923077 0.69473684]

mean value: 0.7353702947739056

key: train_fscore
value: [0.74327628 0.7409201 0.74816626 0.75598086 0.74002418 0.74181818
0.7496977 0.74634146 0.73992674 0.74634146]

mean value: 0.7452493235793951

key: test_precision
value: [0.69642857 0.70212766 0.68085106 0.75 0.7804878 0.74418605
0.65957447 0.62745098 0.74468085 0.64705882]

mean value: 0.7032846269293008

key: train_precision
value: [0.71529412 0.70669746 0.72 0.71331828 0.70669746 0.7099768
0.71593533 0.71662763 0.71126761 0.71662763]

mean value: 0.7132442329211506

key: test_recall
value: [0.88636364 0.75 0.72727273 0.88636364 0.74418605 0.74418605
0.72093023 0.72727273 0.79545455 0.75 ]

mean value: 0.7732029598308668

key: train_recall
value: [0.7735369 0.77862595 0.77862595 0.80407125 0.77664975 0.77664975
0.78680203 0.77862595 0.77099237 0.77862595]

mean value: 0.7803205848542385

key: test_roc_auc
value: [0.75 0.71590909 0.69318182 0.79545455 0.7698203 0.74709302
0.67864693 0.64270613 0.75819239 0.66569767]

mean value: 0.7216701902748415

key: train_roc_auc
value: [0.73282443 0.72773537 0.73791349 0.74045802 0.72674726 0.72929179
0.73691247 0.73575968 0.72940481 0.73575968]

mean value: 0.7332806990351455

key: test_jcc
value: [0.63934426 0.56896552 0.54237288 0.68421053 0.61538462 0.59259259
0.52542373 0.50793651 0.625 0.53225806]

mean value: 0.5833488696451588

key: train_jcc
value: [0.59143969 0.58846154 0.59765625 0.60769231 0.58733205 0.58959538
0.59961315 0.59533074 0.5872093 0.59533074]

mean value: 0.593966114806459

MCC on Blind test: 0.29

Accuracy on Blind test: 0.66

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.03588009 0.02543688 0.0264957 0.03651309 0.03167486 0.02774239
0.02617311 0.03044653 0.02691627 0.02822828]

mean value: 0.029550719261169433

key: score_time
value: [0.01276159 0.01245403 0.01267171 0.01251268 0.01240969 0.01239777
0.01264715 0.01272607 0.012429 0.01247954]

mean value: 0.01254892349243164

key: test_mcc
value: [0.54772256 0.3796283 0.26490647 0.61419227 0.65696218 0.49974958
0.56342495 0.35625628 0.46314724 0.49682118]

mean value: 0.48428110037982225

key: train_mcc
value: [0.68816837 0.47401498 0.41612519 0.65815286 0.72539042 0.678299
0.62574484 0.35995489 0.53423919 0.66801919]

mean value: 0.582810893164612

key: test_accuracy
value: [0.77272727 0.65909091 0.59090909 0.79545455 0.82758621 0.74712644
0.7816092 0.6091954 0.70114943 0.74712644]

mean value: 0.7231974921630094

key: train_accuracy
value: [0.83842239 0.69720102 0.65267176 0.81170483 0.86022872 0.8360864
0.81194409 0.61880559 0.73189327 0.83354511]

mean value: 0.7692503176620076

key: test_fscore
value: [0.76190476 0.53125 0.35714286 0.82 0.83146067 0.76086957
0.7816092 0.37037037 0.76363636 0.73809524]

mean value: 0.6716339025926584

key: train_fscore
value: [0.82237762 0.58098592 0.47398844 0.8377193 0.86810552 0.84661118
0.80474934 0.3877551 0.78491335 0.82875817]

mean value: 0.7235963934245662

key: test_precision
value: [0.8 0.85 0.83333333 0.73214286 0.80434783 0.71428571
0.77272727 1. 0.63636364 0.775 ]

mean value: 0.7918200639939771

key: train_precision
value: [0.91304348 0.94285714 0.97619048 0.73603083 0.82272727 0.79642058
0.83791209 0.97938144 0.6547619 0.85215054]

mean value: 0.851147575381499

key: test_recall
value: [0.72727273 0.38636364 0.22727273 0.93181818 0.86046512 0.81395349
0.79069767 0.22727273 0.95454545 0.70454545]

mean value: 0.6624207188160677

key: train_recall
value: [0.7480916 0.41984733 0.3129771 0.97201018 0.91878173 0.9035533
0.77411168 0.24173028 0.97964377 0.80661578]

mean value: 0.7077362731041965

key: test_roc_auc
value: [0.77272727 0.65909091 0.59090909 0.79545455 0.82795983 0.74788584
0.78171247 0.61363636 0.69820296 0.74762156]

mean value: 0.7235200845665962

key: train_roc_auc
value: [0.83842239 0.69720102 0.65267176 0.81170483 0.86015422 0.83600057
0.81199222 0.61832707 0.73220767 0.83351093]

mean value: 0.769219268673874

key: test_jcc
value: [0.61538462 0.36170213 0.2173913 0.69491525 0.71153846 0.61403509
0.64150943 0.22727273 0.61764706 0.58490566]

mean value: 0.5286301731322943

key: train_jcc
value: [0.69833729 0.40942928 0.31060606 0.72075472 0.76694915 0.73402062
0.67328918 0.24050633 0.64597315 0.70758929]

mean value: 0.5907455073658393

MCC on Blind test: 0.43

Accuracy on Blind test: 0.73

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.03075266 0.03120208 0.03316736 0.03134346 0.03602695 0.03724575
0.04394269 0.03862977 0.03296971 0.04053521]

mean value: 0.035581564903259276

key: score_time
value: [0.01244569 0.0128181 0.01300859 0.012604 0.0123992 0.01992011
0.01752043 0.01252294 0.01711726 0.01253414]

mean value: 0.014289045333862304

key: test_mcc
value: [0.65273779 0.35805744 0.43386092 0.3380617 0.42085785 0.58615222
0.73720764 0.66651249 0.62044826 0.50908452]

mean value: 0.5322980837799182

key: train_mcc
value: [0.68348613 0.29359034 0.59278749 0.4467915 0.44671605 0.74603073
0.6997302 0.69671464 0.56654792 0.69518732]

mean value: 0.5867582318483535

key: test_accuracy
value: [0.80681818 0.61363636 0.68181818 0.63636364 0.66666667 0.79310345
0.86206897 0.81609195 0.79310345 0.74712644]

mean value: 0.7416797283176594

key: train_accuracy
value: [0.82569975 0.58142494 0.76463104 0.67557252 0.67090216 0.8729352
0.84879288 0.83227446 0.75984752 0.8386277 ]

mean value: 0.7670708168035927

key: test_fscore
value: [0.83495146 0.37037037 0.75 0.48387097 0.50847458 0.79069767
0.87234043 0.84313725 0.75675676 0.71794872]

mean value: 0.6928548200252127

key: train_fscore
value: [0.84861878 0.2832244 0.807892 0.53038674 0.51588785 0.87179487
0.85470085 0.8539823 0.69952305 0.81779053]

mean value: 0.708380139104571

key: test_precision
value: [0.72881356 1. 0.61764706 0.83333333 0.9375 0.79069767
0.80392157 0.74137931 0.93333333 0.82352941]

mean value: 0.8210155249967819

key: train_precision
value: [0.75 0.98484848 0.68245614 0.96 0.9787234 0.88082902
0.82352941 0.7553816 0.93220339 0.9375 ]

mean value: 0.868547145129061

key: test_recall
value: [0.97727273 0.22727273 0.95454545 0.34090909 0.34883721 0.79069767
0.95348837 0.97727273 0.63636364 0.63636364]

mean value: 0.6843023255813954

key: train_recall
value: [0.97709924 0.1653944 0.98982188 0.36641221 0.35025381 0.86294416
0.88832487 0.9821883 0.55979644 0.72519084]

mean value: 0.6867426150527635

key: test_roc_auc
value: [0.80681818 0.61363636 0.68181818 0.63636364 0.66305497 0.79307611
0.86310782 0.81421776 0.794926 0.74841438]

mean value: 0.7415433403805497

key: train_roc_auc
value: [0.82569975 0.58142494 0.76463104 0.67557252 0.67131011 0.87294791
0.84874259 0.83246471 0.75959365 0.83848374]

mean value: 0.7670870952325596

key: test_jcc
value: [0.71666667 0.22727273 0.6 0.31914894 0.34090909 0.65384615
0.77358491 0.72881356 0.60869565 0.56 ]

mean value: 0.5528937692021176

key: train_jcc
value: [0.73704415 0.16497462 0.67770035 0.36090226 0.34760705 0.77272727
0.74626866 0.74517375 0.53789731 0.69174757]

mean value: 0.5782042980076957

MCC on Blind test: 0.56

Accuracy on Blind test: 0.74

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.25977039 0.24408913 0.24532914 0.24836373 0.24611473 0.24944425
0.25362062 0.25479889 0.24295735 0.24418402]

mean value: 0.24886722564697267

key: score_time
value: [0.0159719 0.01603484 0.01604414 0.01655626 0.01659632 0.01704359
0.01575017 0.01572776 0.0158062 0.01587892]

mean value: 0.016141009330749512

key: test_mcc
value: [0.79566006 0.70618882 0.79730996 0.73029674 0.68066848 0.83932347
0.7951307 0.79480784 0.79334038 0.81606765]

mean value: 0.7748794103853376

key: train_mcc
value: [0.8909343 0.89850975 0.88584325 0.89602536 0.89120603 0.89603807
0.90102964 0.88823013 0.88836928 0.88587654]

mean value: 0.8922062350735477

key: test_accuracy
value: [0.89772727 0.85227273 0.89772727 0.86363636 0.83908046 0.91954023
0.89655172 0.89655172 0.89655172 0.90804598]

mean value: 0.8867685475444096

key: train_accuracy
value: [0.94529262 0.94910941 0.94274809 0.94783715 0.94536213 0.94790343
0.95044473 0.94409149 0.94409149 0.94282084]

mean value: 0.9459701381546828

key: test_fscore
value: [0.89655172 0.85714286 0.9010989 0.86956522 0.82926829 0.91954023
0.8988764 0.9010989 0.89655172 0.90909091]

mean value: 0.8878785161161101

key: train_fscore
value: [0.94604768 0.94974874 0.94353827 0.94855709 0.9463171 0.94855709
0.9509434 0.9443038 0.94458438 0.94339623]

mean value: 0.9465993775790983

key: test_precision
value: [0.90697674 0.82978723 0.87234043 0.83333333 0.87179487 0.90909091
0.86956522 0.87234043 0.90697674 0.90909091]

mean value: 0.8781296814179803

key: train_precision
value: [0.93316832 0.93796526 0.93069307 0.93564356 0.93120393 0.93796526
0.94264339 0.9395466 0.93516209 0.93283582]

mean value: 0.9356827309466825

key: test_recall
value: [0.88636364 0.88636364 0.93181818 0.90909091 0.79069767 0.93023256
0.93023256 0.93181818 0.88636364 0.90909091]

mean value: 0.8992071881606765

key: train_recall
value: [0.95928753 0.96183206 0.956743 0.96183206 0.96192893 0.95939086
0.95939086 0.94910941 0.95419847 0.95419847]

mean value: 0.9577911677710182

key: test_roc_auc
value: [0.89772727 0.85227273 0.89772727 0.86363636 0.83853066 0.91966173
0.89693446 0.89614165 0.89667019 0.90803383]

mean value: 0.8867336152219872

key: train_roc_auc
value: [0.94529262 0.94910941 0.94274809 0.94783715 0.94534106 0.94788882
0.95043334 0.94409785 0.94410431 0.94283528]

mean value: 0.9459687939964609

key: test_jcc
value: [0.8125 0.75 0.82 0.76923077 0.70833333 0.85106383
0.81632653 0.82 0.8125 0.83333333]

mean value: 0.7993287796296915

key: train_jcc
value: [0.89761905 0.90430622 0.89311164 0.90214797 0.89810427 0.90214797
0.90647482 0.89448441 0.89498807 0.89285714]

mean value: 0.8986241557090046

MCC on Blind test: 0.57

Accuracy on Blind test: 0.79

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
|
|
value: [0.22399497 0.2275095 0.24593782 0.22993827 0.23426127 0.1324122
|
|
0.22641897 0.228791 0.22016549 0.2273438 ]
|
|
|
|
mean value: 0.21967732906341553
|
|
|
|
key: score_time
|
|
value: [0.03998446 0.04093313 0.03987241 0.04082918 0.04121041 0.0396142
|
|
0.02495956 0.04249907 0.04032087 0.03757215]
|
|
|
|
mean value: 0.03877954483032227
|
|
|
|
key: test_mcc
|
|
value: [0.86452993 0.82589664 0.81818182 0.79566006 0.79334038 0.74735729
|
|
0.83923862 0.77312462 0.81702814 0.81606765]
|
|
|
|
mean value: 0.809042516543566
|
|
|
|
key: train_mcc
|
|
value: [0.99239533 0.99492383 0.98982188 0.98730612 0.98476502 0.97738462
|
|
0.98729673 0.98732207 0.98480289 0.98480289]
|
|
|
|
mean value: 0.9870821365169552
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.90909091 0.90909091 0.89772727 0.89655172 0.87356322
|
|
0.91954023 0.88505747 0.90804598 0.90804598]
|
|
|
|
mean value: 0.9038531870428422
|
|
|
|
key: train_accuracy
|
|
value: [0.99618321 0.99745547 0.99491094 0.99363868 0.99237611 0.98856417
|
|
0.99364676 0.99364676 0.99237611 0.99237611]
|
|
|
|
mean value: 0.9935174318037059
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]

key: test_fscore

value: [0.93333333 0.91489362 0.90909091 0.8988764 0.89655172 0.87356322
|
|
0.91764706 0.89130435 0.90697674 0.90909091]
|
|
|
|
mean value: 0.9051328266395209
|
|
|
|
key: train_fscore
|
|
value: [0.99616858 0.99746193 0.99491094 0.9936143 0.99236641 0.98844673
|
|
0.99364676 0.9936143 0.99232737 0.99232737]
|
|
|
|
mean value: 0.9934884690795172
|
|
|
|
key: test_precision
|
|
value: [0.91304348 0.86 0.90909091 0.88888889 0.88636364 0.86363636
|
|
0.92857143 0.85416667 0.92857143 0.90909091]
|
|
|
|
mean value: 0.8941423709141101
|
|
|
|
key: train_precision
|
|
value: [1. 0.99493671 0.99491094 0.9974359 0.99489796 1.
|
|
0.99491094 0.9974359 0.99742931 0.99742931]
|
|
|
|
mean value: 0.9969386957693075
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.97727273 0.90909091 0.90909091 0.90697674 0.88372093
|
|
0.90697674 0.93181818 0.88636364 0.90909091]
|
|
|
|
mean value: 0.9174947145877378
|
|
|
|
key: train_recall
|
|
value: [0.99236641 1. 0.99491094 0.98982188 0.98984772 0.97715736
|
|
0.99238579 0.98982188 0.98727735 0.98727735]
|
|
|
|
mean value: 0.9900866689916172
|
|
|
|
key: test_roc_auc
|
|
value: [0.93181818 0.90909091 0.90909091 0.89772727 0.89667019 0.87367865
|
|
0.91939746 0.88451374 0.9082981 0.90803383]
|
|
|
|
mean value: 0.9038319238900634
|
|
|
|
key: train_roc_auc
|
|
value: [0.99618321 0.99745547 0.99491094 0.99363868 0.99237933 0.98857868
|
|
0.99364836 0.99364191 0.99236964 0.99236964]
|
|
|
|
mean value: 0.9935175856679712
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.84313725 0.83333333 0.81632653 0.8125 0.7755102
|
|
0.84782609 0.80392157 0.82978723 0.83333333]
|
|
|
|
mean value: 0.8270675545889031
|
|
|
|
key: train_jcc
|
|
value: [0.99236641 0.99493671 0.98987342 0.98730964 0.98484848 0.97715736
|
|
0.98737374 0.98730964 0.98477157 0.98477157]
|
|
|
|
mean value: 0.9870718557972555
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.34912157 0.47804213 0.33337474 0.45919132 0.30793595 0.29898429
|
|
0.39014459 0.32371235 0.44865108 0.29094195]
|
|
|
|
mean value: 0.3680099964141846
|
|
|
|
key: score_time
|
|
value: [0.03037262 0.0336256 0.01945448 0.01939178 0.01974702 0.01969528
|
|
0.0238955 0.019871 0.03380036 0.03445649]
|
|
|
|
mean value: 0.025431013107299803
|
|
|
|
key: test_mcc
|
|
value: [0.66759342 0.50471461 0.54601891 0.57551157 0.5504913 0.50908452
|
|
0.47273749 0.45482695 0.56980678 0.54466285]
|
|
|
|
mean value: 0.5395448399020225
|
|
|
|
key: train_mcc
|
|
value: [0.92972888 0.93021184 0.92995819 0.92243746 0.91770005 0.9198622
|
|
0.92011198 0.92012314 0.92818308 0.91987241]
|
|
|
|
mean value: 0.9238189213913246
|
|
|
|
key: test_accuracy
|
|
value: [0.82954545 0.75 0.77272727 0.78409091 0.77011494 0.74712644
|
|
0.73563218 0.72413793 0.7816092 0.77011494]
|
|
|
|
mean value: 0.7665099268547544
|
|
|
|
key: train_accuracy
|
|
value: [0.96437659 0.96437659 0.96437659 0.9605598 0.95806861 0.95933926
|
|
0.95933926 0.95933926 0.96315121 0.95933926]
|
|
|
|
mean value: 0.961226644163587
|
|
|
|
key: test_fscore
|
|
value: [0.84210526 0.76595745 0.76744186 0.8 0.78723404 0.77083333
|
|
0.74157303 0.75 0.8 0.78723404]
|
|
|
|
mean value: 0.7812379022579103
|
|
|
|
key: train_fscore
|
|
value: [0.96517413 0.96534653 0.96526055 0.96158612 0.95930949 0.96039604
|
|
0.96049383 0.96039604 0.96424168 0.96029777]
|
|
|
|
mean value: 0.9622502175860964
|
|
|
|
key: test_precision
|
|
value: [0.78431373 0.72 0.78571429 0.74509804 0.7254902 0.69811321
|
|
0.7173913 0.69230769 0.74509804 0.74 ]
|
|
|
|
mean value: 0.7353526489916974
|
|
|
|
key: train_precision
|
|
value: [0.94403893 0.93975904 0.94188862 0.93719807 0.93285372 0.93719807
|
|
0.93509615 0.93493976 0.9354067 0.937046 ]
|
|
|
|
mean value: 0.9375425054021276
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.81818182 0.75 0.86363636 0.86046512 0.86046512
|
|
0.76744186 0.81818182 0.86363636 0.84090909]
|
|
|
|
mean value: 0.835200845665962
|
|
|
|
key: train_recall
|
|
value: [0.98727735 0.99236641 0.98982188 0.98727735 0.98730964 0.98477157
|
|
0.98730964 0.98727735 0.99491094 0.98473282]
|
|
|
|
mean value: 0.9883054985081567
|
|
|
|
key: test_roc_auc
|
|
value: [0.82954545 0.75 0.77272727 0.78409091 0.77114165 0.74841438
|
|
0.73599366 0.7230444 0.78065539 0.76929175]
|
|
|
|
mean value: 0.7664904862579281
|
|
|
|
key: train_roc_auc
|
|
value: [0.96437659 0.96437659 0.96437659 0.9605598 0.95803141 0.95930691
|
|
0.95930368 0.95937472 0.96319151 0.95937149]
|
|
|
|
mean value: 0.9612269280944447
|
|
|
|
key: test_jcc
|
|
value: [0.72727273 0.62068966 0.62264151 0.66666667 0.64912281 0.62711864
|
|
0.58928571 0.6 0.66666667 0.64912281]
|
|
|
|
mean value: 0.6418587197601036
|
|
|
|
key: train_jcc
|
|
value: [0.93269231 0.93301435 0.93285372 0.92601432 0.92180095 0.92380952
|
|
0.9239905 0.92380952 0.93095238 0.92362768]
|
|
|
|
mean value: 0.927256525881002
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.08939147 1.07237315 1.0702858 1.06541538 1.07314181 1.07619619
|
|
1.07927656 1.07755518 1.0714097 1.0669136 ]
|
|
|
|
mean value: 1.074195885658264
|
|
|
|
key: score_time
|
|
value: [0.01029015 0.00958037 0.01048279 0.00957704 0.00974631 0.00978112
|
|
0.00978923 0.00957084 0.00987887 0.00948596]
|
|
|
|
mean value: 0.009818267822265626
|
|
|
|
key: test_mcc
|
|
value: [0.88659264 0.84639167 0.81818182 0.84287052 0.81702814 0.86303555
|
|
0.93329922 0.79480784 0.81702814 0.83923862]
|
|
|
|
mean value: 0.8458474162871303
|
|
|
|
key: train_mcc
|
|
value: [0.9567461 0.9567461 0.94912171 0.96437659 0.96443403 0.95695029
|
|
0.97207422 0.96696611 0.94918593 0.95426922]
|
|
|
|
mean value: 0.959087029355719
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.92045455 0.90909091 0.92045455 0.90804598 0.93103448
|
|
0.96551724 0.89655172 0.90804598 0.91954023]
|
|
|
|
mean value: 0.9221917450365726
|
|
|
|
key: train_accuracy
|
|
value: [0.9783715 0.9783715 0.97455471 0.9821883 0.98221093 0.97839898
|
|
0.98602287 0.98348158 0.97458704 0.97712834]
|
|
|
|
mean value: 0.9795315738252972
|
|
|
|
key: test_fscore
|
|
value: [0.94252874 0.92473118 0.90909091 0.92307692 0.90909091 0.93181818
|
|
0.96629213 0.9010989 0.90697674 0.92134831]
|
|
|
|
mean value: 0.9236052936227956
|
|
|
|
key: train_fscore
|
|
value: [0.97834395 0.97834395 0.9744898 0.9821883 0.98227848 0.97823303
|
|
0.98598726 0.98343949 0.9744898 0.97715736]
|
|
|
|
mean value: 0.979495141267347
|
|
|
|
key: test_precision
|
|
value: [0.95348837 0.87755102 0.90909091 0.89361702 0.88888889 0.91111111
|
|
0.93478261 0.87234043 0.92857143 0.91111111]
|
|
|
|
mean value: 0.9080552896778799
|
|
|
|
key: train_precision
|
|
value: [0.97959184 0.97959184 0.9769821 0.9821883 0.97979798 0.9870801
|
|
0.98976982 0.98469388 0.9769821 0.97468354]
|
|
|
|
mean value: 0.9811361488992022
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.97727273 0.90909091 0.95454545 0.93023256 0.95348837
|
|
1. 0.93181818 0.88636364 0.93181818]
|
|
|
|
mean value: 0.9406448202959831
|
|
|
|
key: train_recall
|
|
value: [0.97709924 0.97709924 0.97201018 0.9821883 0.98477157 0.96954315
|
|
0.9822335 0.9821883 0.97201018 0.97964377]
|
|
|
|
mean value: 0.977878740910089
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.92045455 0.90909091 0.92045455 0.9082981 0.93128964
|
|
0.96590909 0.89614165 0.9082981 0.91939746]
|
|
|
|
mean value: 0.9222515856236786
|
|
|
|
key: train_roc_auc
|
|
value: [0.9783715 0.9783715 0.97455471 0.9821883 0.98220767 0.97841025
|
|
0.98602769 0.98347993 0.97458377 0.97713153]
|
|
|
|
mean value: 0.9795326849304452
|
|
|
|
key: test_jcc
|
|
value: [0.89130435 0.86 0.83333333 0.85714286 0.83333333 0.87234043
|
|
0.93478261 0.82 0.82978723 0.85416667]
|
|
|
|
mean value: 0.8586190806572398
|
|
|
|
key: train_jcc
|
|
value: [0.95760599 0.95760599 0.95024876 0.965 0.96517413 0.95739348
|
|
0.97236181 0.96741855 0.95024876 0.95533499]
|
|
|
|
mean value: 0.9598392438579324
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")

Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03878379 0.03929353 0.03901601 0.04018807 0.04464102 0.04195595
|
|
0.03913212 0.03936505 0.04722381 0.04217124]
|
|
|
|
mean value: 0.04117705821990967
|
|
|
|
key: score_time
|
|
value: [0.01283026 0.01283789 0.01293039 0.01398873 0.01305914 0.01290655
|
|
0.01287484 0.01305056 0.01294804 0.01292896]
|
|
|
|
mean value: 0.01303553581237793
|
|
|
|
key: test_mcc
|
|
value: [0.14322297 0.12598816 0.10910895 0.03750293 0.15546399 0.04655125
|
|
0.21701954 0.2497872 0.15100772 0.19262997]
|
|
|
|
mean value: 0.14282826822070938
|
|
|
|
key: train_mcc
|
|
value: [0.24351425 0.24932341 0.24643203 0.25784831 0.2316976 0.24947191
|
|
0.23773786 0.22804979 0.25734636 0.23413734]
|
|
|
|
mean value: 0.24355588486096966
|
|
|
|
key: test_accuracy
|
|
value: [0.54545455 0.53409091 0.52272727 0.51136364 0.54022989 0.50574713
|
|
0.54022989 0.56321839 0.54022989 0.56321839]
|
|
|
|
mean value: 0.5366509926854754
|
|
|
|
key: train_accuracy
|
|
value: [0.55597964 0.55852417 0.55725191 0.56234097 0.55146125 0.55908513
|
|
0.55400254 0.54891995 0.56162643 0.55146125]
|
|
|
|
mean value: 0.5560653235949317
|
|
|
|
key: test_fscore
|
|
value: [0.67213115 0.672 0.671875 0.6504065 0.67213115 0.656
|
|
0.68253968 0.6984127 0.68253968 0.68852459]
|
|
|
|
mean value: 0.6746560452803005
|
|
|
|
key: train_fscore
|
|
value: [0.69251101 0.69373345 0.69312169 0.69557522 0.69062226 0.69427313
|
|
0.69183494 0.68886941 0.69496021 0.69007902]
|
|
|
|
mean value: 0.6925580352130288
|
|
|
|
key: test_precision
|
|
value: [0.52564103 0.51851852 0.51190476 0.50632911 0.51898734 0.5
|
|
0.51807229 0.53658537 0.52439024 0.53846154]
|
|
|
|
mean value: 0.5198890199134771
|
|
|
|
key: train_precision
|
|
value: [0.5296496 0.53108108 0.53036437 0.53324288 0.52744311 0.5317139
|
|
0.52885906 0.52540107 0.53252033 0.52680965]
|
|
|
|
mean value: 0.5297085038255003
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.95454545 0.97727273 0.90909091 0.95348837 0.95348837
|
|
1. 1. 0.97727273 0.95454545]
|
|
|
|
mean value: 0.9611522198731501
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.54545455 0.53409091 0.52272727 0.51136364 0.544926 0.5108351
|
|
0.54545455 0.55813953 0.53514799 0.55866808]
|
|
|
|
mean value: 0.5366807610993658
|
|
|
|
key: train_roc_auc
|
|
value: [0.55597964 0.55852417 0.55725191 0.56234097 0.55089059 0.55852417
|
|
0.55343511 0.54949239 0.56218274 0.55203046]
|
|
|
|
mean value: 0.5560652148641841
|
|
|
|
key: test_jcc
|
|
value: [0.50617284 0.5060241 0.50588235 0.48192771 0.50617284 0.48809524
|
|
0.51807229 0.53658537 0.51807229 0.525 ]
|
|
|
|
mean value: 0.5092005021444588
|
|
|
|
key: train_jcc
|
|
value: [0.5296496 0.53108108 0.53036437 0.53324288 0.52744311 0.5317139
|
|
0.52885906 0.52540107 0.53252033 0.52680965]
|
|
|
|
mean value: 0.5297085038255003
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01776242 0.01769638 0.01799273 0.04317951 0.04338765 0.04285359
|
|
0.03382993 0.02006269 0.01762819 0.04450798]
|
|
|
|
mean value: 0.029890108108520507
|
|
|
|
key: score_time
|
|
value: [0.01360846 0.012321 0.01231194 0.01928067 0.01912475 0.01911569
|
|
0.01230025 0.01963973 0.0123415 0.01916599]
|
|
|
|
mean value: 0.01592099666595459
|
|
|
|
key: test_mcc
|
|
value: [0.64236405 0.62155249 0.47838597 0.57188626 0.65696218 0.54295079
|
|
0.67038474 0.62173301 0.58655447 0.65539112]
|
|
|
|
mean value: 0.6048165102926117
|
|
|
|
key: train_mcc
|
|
value: [0.73960469 0.73336325 0.72349182 0.73847379 0.73094792 0.7156377
|
|
0.72143309 0.74631504 0.73645742 0.72660979]
|
|
|
|
mean value: 0.7312334508927151
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.80681818 0.73863636 0.78409091 0.82758621 0.77011494
|
|
0.82758621 0.8045977 0.79310345 0.82758621]
|
|
|
|
mean value: 0.799830198537095
|
|
|
|
key: train_accuracy
|
|
value: [0.86768448 0.86513995 0.86005089 0.86768448 0.86404066 0.85641677
|
|
0.85895807 0.87166455 0.86658196 0.86149936]
|
|
|
|
mean value: 0.8639721168737532
|
|
|
|
key: test_fscore
|
|
value: [0.82978723 0.82105263 0.74725275 0.79569892 0.83146067 0.77777778
|
|
0.84210526 0.82474227 0.8 0.82758621]
|
|
|
|
mean value: 0.8097463727636196
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:156: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:159: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)

key: train_fscore

value: [0.87439614 0.87104623 0.86650485 0.87347932 0.86998785 0.86269745
|
|
0.86577993 0.87697929 0.87241798 0.86787879]
|
|
|
|
mean value: 0.870116782663218
|
|
|
|
key: test_precision
|
|
value: [0.78 0.76470588 0.72340426 0.75510204 0.80434783 0.74468085
|
|
0.76923077 0.75471698 0.7826087 0.8372093 ]
|
|
|
|
mean value: 0.7716006603979804
|
|
|
|
key: train_precision
|
|
value: [0.83218391 0.83449883 0.82830626 0.83682984 0.83449883 0.82750583
|
|
0.82678984 0.8411215 0.83488372 0.8287037 ]
|
|
|
|
mean value: 0.8325322264178692
|
|
|
|
key: test_recall
|
|
value: [0.88636364 0.88636364 0.77272727 0.84090909 0.86046512 0.81395349
|
|
0.93023256 0.90909091 0.81818182 0.81818182]
|
|
|
|
mean value: 0.8536469344608879
|
|
|
|
key: train_recall
|
|
value: [0.92111959 0.91094148 0.90839695 0.91348601 0.90862944 0.90101523
|
|
0.90862944 0.91603053 0.91348601 0.91094148]
|
|
|
|
mean value: 0.9112676147298536
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.80681818 0.73863636 0.78409091 0.82795983 0.77061311
|
|
0.82875264 0.80338266 0.79281184 0.82769556]
|
|
|
|
mean value: 0.7998942917547569
|
|
|
|
key: train_roc_auc
|
|
value: [0.86768448 0.86513995 0.86005089 0.86768448 0.86398393 0.85636003
|
|
0.85889487 0.87172085 0.86664148 0.86156211]
|
|
|
|
mean value: 0.8639723072551375
|
|
|
|
key: test_jcc
|
|
value: [0.70909091 0.69642857 0.59649123 0.66071429 0.71153846 0.63636364
|
|
0.72727273 0.70175439 0.66666667 0.70588235]
|
|
|
|
mean value: 0.6812203225051522
|
|
|
|
key: train_jcc
|
|
value: [0.77682403 0.77155172 0.76445396 0.77537797 0.76989247 0.75854701
|
|
0.76332623 0.78091106 0.7737069 0.76659529]
|
|
|
|
mean value: 0.7701186645906976
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
value: [0.33544517 0.34822321 0.32937121 0.32755113 0.33985925 0.4299016
 0.33088613 0.3283422 0.32882071 0.38829613]

mean value: 0.3486696720123291

key: score_time
value: [0.01930285 0.01924634 0.01923442 0.01921296 0.01925468 0.01923203
 0.01920033 0.01935935 0.01922727 0.01912498]

mean value: 0.019239521026611327

key: test_mcc
value: [0.64236405 0.61763716 0.47838597 0.57188626 0.65696218 0.54295079
 0.67803941 0.64863047 0.58655447 0.60940803]

mean value: 0.6032818808066384

key: train_mcc
value: [0.73960469 0.75496449 0.72349182 0.73847379 0.73094792 0.7156377
 0.75012172 0.77148345 0.73645742 0.73645742]

mean value: 0.7397640414438856

key: test_accuracy
value: [0.81818182 0.80681818 0.73863636 0.78409091 0.82758621 0.77011494
 0.82758621 0.81609195 0.79310345 0.8045977 ]

mean value: 0.7986807732497387

key: train_accuracy
value: [0.86768448 0.87659033 0.86005089 0.86768448 0.86404066 0.85641677
 0.8729352 0.88437103 0.86658196 0.86658196]

mean value: 0.868293775117931

key: test_fscore
value: [0.82978723 0.8172043 0.74725275 0.79569892 0.83146067 0.77777778
 0.84536082 0.83673469 0.8 0.8045977 ]

mean value: 0.8085874878806077

key: train_fscore
value: [0.87439614 0.88068881 0.86650485 0.87347932 0.86998785 0.86269745
 0.87951807 0.88888889 0.87241798 0.87241798]

mean value: 0.8740997340105042

key: test_precision
value: [0.78 0.7755102 0.72340426 0.75510204 0.80434783 0.74468085
 0.75925926 0.75925926 0.7826087 0.81395349]

mean value: 0.769812587991068

key: train_precision
value: [0.83218391 0.85238095 0.82830626 0.83682984 0.83449883 0.82750583
 0.83715596 0.85446009 0.83488372 0.83488372]

mean value: 0.8373089122822519

key: test_recall
value: [0.88636364 0.86363636 0.77272727 0.84090909 0.86046512 0.81395349
 0.95348837 0.93181818 0.81818182 0.79545455]

mean value: 0.8536997885835095

key: train_recall
value: [0.92111959 0.91094148 0.90839695 0.91348601 0.90862944 0.90101523
 0.92639594 0.92620865 0.91348601 0.91348601]

mean value: 0.9143165291070898

key: test_roc_auc
value: [0.81818182 0.80681818 0.73863636 0.78409091 0.82795983 0.77061311
 0.82901691 0.8147463 0.79281184 0.80470402]

mean value: 0.7987579281183932

key: train_roc_auc
value: [0.86768448 0.87659033 0.86005089 0.86768448 0.86398393 0.85636003
 0.87286718 0.88442412 0.86664148 0.86664148]

mean value: 0.8682928404438072

key: test_jcc
value: [0.70909091 0.69090909 0.59649123 0.66071429 0.71153846 0.63636364
 0.73214286 0.71929825 0.66666667 0.67307692]

mean value: 0.6796292304187042

key: train_jcc
value: [0.77682403 0.78681319 0.76445396 0.77537797 0.76989247 0.75854701
 0.78494624 0.8 0.7737069 0.7737069 ]

mean value: 0.7764268663694349

MCC on Blind test: 0.47

Accuracy on Blind test: 0.74
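
Note on reading the per-fold scores: every test_* value for a fold is a function of that fold's confusion matrix. The sketch below is a worked check, not part of the pipeline. The counts TP=39, FP=11, FN=5, TN=33 are an assumption: they are not printed in the log, but were inferred from the first fold's test_precision (0.78), test_recall (0.88636364) and test_accuracy (0.81818182). The function name `binary_metrics` is hypothetical. With these counts the standard formulas reproduce the first-fold test_fscore, test_jcc and test_mcc, and the balanced accuracy coincides with the reported test_roc_auc, which is what ROC AUC reduces to when it is computed from hard class labels (whether the script does so is not shown here).

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification scores from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Counts inferred from the first test fold (assumption, not printed in the log).
fold1 = binary_metrics(tp=39, fp=11, fn=5, tn=33)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

The same check can be repeated for any other fold once its counts have been recovered from precision, recall and accuracy.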
|
|
|
|
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
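
Note on the warning: a ConvergenceWarning is emitted by each fit that hits the solver's iteration limit, so a cross-validated run can print it many times. A minimal sketch of one way to tally such warnings instead of printing each one, using only the standard `warnings` module. Everything here is illustrative: `ConvergenceWarning` is a local stand-in for `sklearn.exceptions.ConvergenceWarning` so the sketch runs without scikit-learn, and `noisy_fit` and `run_and_count` are hypothetical names. It assumes all fits run in the same process; warnings raised in parallel worker processes may not be captured this way.

```python
import warnings
from collections import Counter


class ConvergenceWarning(UserWarning):
    """Local stand-in for sklearn.exceptions.ConvergenceWarning (assumption)."""


def noisy_fit(fold):
    # Hypothetical stand-in for one cross-validation fit that fails to converge.
    warnings.warn("lbfgs failed to converge (status=1)", ConvergenceWarning)
    return fold


def run_and_count(n_folds):
    # Record every warning instead of printing it, then tally by category and message.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        results = [noisy_fit(i) for i in range(n_folds)]
    counts = Counter((w.category.__name__, str(w.message)) for w in caught)
    return results, counts


results, counts = run_and_count(9)
for (category, message), n in counts.items():
    print(f"{category} x{n}: {message}")
```

Counting keeps the information that the solver failed to converge on several fits, which simply silencing the warning would hide.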
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
value: [0.04968023 0.03789711 0.07179022 0.05948281 0.05385971 0.05733061
 0.03771996 0.03850865 0.0391221 0.03841591]

mean value: 0.04838073253631592

key: score_time
value: [0.01359701 0.01272964 0.02741122 0.01486683 0.01206851 0.01363039
 0.0127759 0.0128696 0.0128572 0.01296473]

mean value: 0.014577102661132813

key: test_mcc
value: [0.44539933 0.47082362 0.62622429 0.40644851 0.5809475 0.59404013
 0.5957539 0.61895161 0.49193548 0.52371369]

mean value: 0.5354238070669484

key: train_mcc
value: [0.70137886 0.69003127 0.68667602 0.69273127 0.65895585 0.66630992
 0.67291968 0.66568726 0.65836157 0.69060208]

mean value: 0.678365378762692

key: test_accuracy
value: [0.71875 0.734375 0.8125 0.703125 0.78125 0.796875
 0.79365079 0.80952381 0.74603175 0.76190476]

mean value: 0.7657986111111111

key: train_accuracy
value: [0.84965035 0.84440559 0.84265734 0.84615385 0.82867133 0.83216783
 0.83595113 0.83246073 0.82897033 0.84467714]

mean value: 0.8385765630530029

key: test_fscore
value: [0.74285714 0.72131148 0.80645161 0.70769231 0.80555556 0.8
 0.80597015 0.80645161 0.75 0.76923077]

mean value: 0.7715520625805794

key: train_fscore
value: [0.85521886 0.84889643 0.84745763 0.84879725 0.83445946 0.83838384
 0.84067797 0.83673469 0.83161512 0.84889643]

mean value: 0.8431137680564013

key: test_precision
value: [0.68421053 0.75862069 0.83333333 0.6969697 0.725 0.78787879
 0.75 0.80645161 0.75 0.75757576]

mean value: 0.7550040404631764

key: train_precision
value: [0.82467532 0.82508251 0.82236842 0.83445946 0.80718954 0.80844156
 0.81848185 0.81727575 0.81756757 0.82508251]

mean value: 0.8200624485874977

key: test_recall
value: [0.8125 0.6875 0.78125 0.71875 0.90625 0.8125
 0.87096774 0.80645161 0.75 0.78125 ]

mean value: 0.792741935483871

key: train_recall
value: [0.88811189 0.87412587 0.87412587 0.86363636 0.86363636 0.87062937
 0.8641115 0.85714286 0.84615385 0.87412587]

mean value: 0.8675799809946152

key: test_roc_auc
value: [0.71875 0.734375 0.8125 0.703125 0.78125 0.796875
 0.79485887 0.80947581 0.74596774 0.76159274]

mean value: 0.7658770161290323

key: train_roc_auc
value: [0.84965035 0.84440559 0.84265734 0.84615385 0.82867133 0.83216783
 0.8359019 0.83241758 0.82900027 0.84472844]

mean value: 0.8385754489413026

key: test_jcc
value: [0.59090909 0.56410256 0.67567568 0.54761905 0.6744186 0.66666667
 0.675 0.67567568 0.6 0.625 ]

mean value: 0.6295067325299883

key: train_jcc
value: [0.74705882 0.73746313 0.73529412 0.73731343 0.71594203 0.72173913
 0.7251462 0.71929825 0.71176471 0.73746313]

mean value: 0.7288482937446694

MCC on Blind test: 0.4

Accuracy on Blind test: 0.71
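
Note on test_fscore and test_jcc: assuming test_fscore is the F1 score and test_jcc the Jaccard index of the positive class (the key names suggest this, but the scorer definitions are not shown in the log), the two are tied by the identity J = F / (2 - F), so they rank folds identically and carry the same information. The sketch below checks the identity against the ten per-fold values printed above; `jaccard_from_f1` is a hypothetical helper name.

```python
# Per-fold test scores copied from the log above.
test_fscore = [0.74285714, 0.72131148, 0.80645161, 0.70769231, 0.80555556,
               0.8, 0.80597015, 0.80645161, 0.75, 0.76923077]
test_jcc = [0.59090909, 0.56410256, 0.67567568, 0.54761905, 0.6744186,
            0.66666667, 0.675, 0.67567568, 0.6, 0.625]


def jaccard_from_f1(f1):
    """Jaccard index implied by an F1 score on the same predictions."""
    # With F = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN),
    # eliminating the counts gives J = F / (2 - F).
    return f1 / (2 - f1)


# Largest disagreement between the logged Jaccard and the one implied by F1.
max_gap = max(abs(jaccard_from_f1(f) - j) for f, j in zip(test_fscore, test_jcc))
print(f"largest |J - F/(2-F)| across folds: {max_gap:.1e}")
```

Any remaining gap comes from the eight-decimal rounding of the printed values.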
|
|
|
|
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.81775093 0.97031665 0.86882544 0.84967637 1.06103253 1.10017157
|
|
1.02854228 0.88756537 0.91167998 0.95300126]
|
|
|
|
mean value: 0.9448562383651733
|
|
|
|
key: score_time
|
|
value: [0.01471949 0.01237798 0.01467729 0.01467299 0.01483011 0.01496291
|
|
0.01218915 0.01475215 0.01921082 0.01462936]
|
|
|
|
mean value: 0.014702224731445312
|
|
|
|
key: test_mcc
|
|
value: [0.4163332 0.50097943 0.625 0.53150959 0.474579 0.65657067
|
|
0.62475802 0.61982085 0.5570134 0.52371369]
|
|
|
|
mean value: 0.5530277845963453
|
|
|
|
key: train_mcc
|
|
value: [0.73177639 0.74552581 0.72871741 0.72814564 0.80829038 0.79480019
|
|
0.74257475 0.79849292 0.79827133 0.73925674]
|
|
|
|
mean value: 0.7615851555815423
|
|
|
|
key: test_accuracy
|
|
value: [0.703125 0.75 0.8125 0.765625 0.734375 0.828125
|
|
0.80952381 0.80952381 0.77777778 0.76190476]
|
|
|
|
mean value: 0.7752480158730158
|
|
|
|
key: train_accuracy
|
|
value: [0.86538462 0.87237762 0.86363636 0.86363636 0.90384615 0.89685315
|
|
0.87085515 0.89877836 0.89877836 0.86910995]
|
|
|
|
mean value: 0.8803256080742992
|
|
|
|
key: test_fscore
|
|
value: [0.73239437 0.74193548 0.8125 0.76923077 0.75362319 0.82539683
|
|
0.81818182 0.8 0.77419355 0.76923077]
|
|
|
|
mean value: 0.7796686768901226
|
|
|
|
key: train_fscore
|
|
value: [0.86882453 0.87521368 0.86779661 0.8668942 0.90566038 0.89948893
|
|
0.87414966 0.90136054 0.90068493 0.87223169]
|
|
|
|
mean value: 0.8832305141086446
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.76666667 0.8125 0.75757576 0.7027027 0.83870968
|
|
0.77142857 0.82758621 0.8 0.75757576]
|
|
|
|
mean value: 0.7701412006932029
|
|
|
|
key: train_precision
|
|
value: [0.84717608 0.85618729 0.84210526 0.84666667 0.88888889 0.87707641
|
|
0.8538206 0.88039867 0.88255034 0.85049834]
|
|
|
|
mean value: 0.8625368544921593
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.71875 0.8125 0.78125 0.8125 0.8125
|
|
0.87096774 0.77419355 0.75 0.78125 ]
|
|
|
|
mean value: 0.792641129032258
|
|
|
|
key: train_recall
|
|
value: [0.89160839 0.8951049 0.8951049 0.88811189 0.92307692 0.92307692
|
|
0.89547038 0.92334495 0.91958042 0.8951049 ]
|
|
|
|
mean value: 0.9049584561779683
|
|
|
|
key: test_roc_auc
|
|
value: [0.703125 0.75 0.8125 0.765625 0.734375 0.828125
|
|
0.81048387 0.80897177 0.77822581 0.76159274]
|
|
|
|
mean value: 0.7753024193548387
|
|
|
|
key: train_roc_auc
|
|
value: [0.86538462 0.87237762 0.86363636 0.86363636 0.90384615 0.89685315
|
|
0.87081211 0.89873541 0.8988146 0.86915524]
|
|
|
|
mean value: 0.8803251626422358
|
|
|
|
key: test_jcc
|
|
value: [0.57777778 0.58974359 0.68421053 0.625 0.60465116 0.7027027
|
|
0.69230769 0.66666667 0.63157895 0.625 ]
|
|
|
|
mean value: 0.6399639065673337
|
|
|
|
key: train_jcc
|
|
value: [0.76807229 0.7781155 0.76646707 0.76506024 0.82758621 0.81733746
|
|
0.77643505 0.82043344 0.81931464 0.7734139 ]
|
|
|
|
mean value: 0.7912235786580607
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
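Note on the metrics above: "mcc" is the Matthews correlation coefficient and "jcc" appears to be the Jaccard score. The sketch below computes both from confusion-matrix counts in plain Python and recomputes the mean of the test_mcc values. The counts (TP=26, TN=26, FP=6, FN=6) are hypothetical: the log does not print a confusion matrix, and they were chosen only because they are consistent with the third fold's reported accuracy, precision and recall of 0.8125.

```python
from math import sqrt
from statistics import mean


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def jaccard(tp, fp, fn):
    """Jaccard score of the positive class: TP / (TP + FP + FN)."""
    total = tp + fp + fn
    return tp / total if total else 0.0


# Hypothetical counts (assumption, see the note above).
tp, tn, fp, fn = 26, 26, 6, 6
fold_mcc = mcc(tp, tn, fp, fn)
fold_jcc = jaccard(tp, fp, fn)

# Per-fold test_mcc values copied from the LogisticRegressionCV block above.
test_mcc = [0.4163332, 0.50097943, 0.625, 0.53150959, 0.474579,
            0.65657067, 0.62475802, 0.61982085, 0.5570134, 0.52371369]
mean_test_mcc = mean(test_mcc)

print(fold_mcc, fold_jcc, mean_test_mcc)
```

With these counts the two functions give 0.625 and about 0.684, matching the third entries of test_mcc and test_jcc. The recomputed mean agrees with the logged mean value up to the rounding of the printed fold values.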
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01438069 0.0107131 0.01069641 0.01035666 0.0103178 0.01025701
|
|
0.01074266 0.01153016 0.01023555 0.01032662]
|
|
|
|
mean value: 0.01095566749572754
|
|
|
|
key: score_time
|
|
value: [0.01253676 0.00952673 0.00941396 0.00904965 0.00897145 0.00896692
|
|
0.00892973 0.00899005 0.00900006 0.00900984]
|
|
|
|
mean value: 0.009439516067504882
|
|
|
|
key: test_mcc
|
|
value: [0.46897905 0.46056619 0.35820928 0.25048972 0.19364917 0.28823068
|
|
0.3393548 0.33367758 0.49344122 0.4089525 ]
|
|
|
|
mean value: 0.35955501764481224
|
|
|
|
key: train_mcc
|
|
value: [0.37965772 0.4185499 0.39072156 0.46245528 0.41614768 0.36263664
|
|
0.37967833 0.37573579 0.4409701 0.39252541]
|
|
|
|
mean value: 0.40190783982214684
|
|
|
|
key: test_accuracy
|
|
value: [0.734375 0.71875 0.671875 0.625 0.59375 0.640625
|
|
0.66666667 0.66666667 0.74603175 0.6984127 ]
|
|
|
|
mean value: 0.6762152777777778
|
|
|
|
key: train_accuracy
|
|
value: [0.68531469 0.70804196 0.69405594 0.73076923 0.7027972 0.67657343
|
|
0.68760908 0.68237347 0.71553229 0.69284468]
|
|
|
|
mean value: 0.6975911958896253
|
|
|
|
key: test_fscore
|
|
value: [0.73015873 0.66666667 0.61818182 0.63636364 0.53571429 0.59649123
|
|
0.61818182 0.6440678 0.74193548 0.66666667]
|
|
|
|
mean value: 0.6454428130484935
|
|
|
|
key: train_fscore
|
|
value: [0.64705882 0.69131238 0.67532468 0.73898305 0.66535433 0.63510848
|
|
0.66290019 0.64031621 0.68101761 0.66023166]
|
|
|
|
mean value: 0.6697607412759368
|
|
|
|
key: test_precision
|
|
value: [0.74193548 0.81818182 0.73913043 0.61764706 0.625 0.68
|
|
0.70833333 0.67857143 0.76666667 0.76 ]
|
|
|
|
mean value: 0.7135466224230352
|
|
|
|
key: train_precision
|
|
value: [0.73660714 0.73333333 0.71936759 0.71710526 0.76126126 0.72850679
|
|
0.72131148 0.73972603 0.77333333 0.73706897]
|
|
|
|
mean value: 0.7367621178530426
|
|
|
|
key: test_recall
|
|
value: [0.71875 0.5625 0.53125 0.65625 0.46875 0.53125
|
|
0.5483871 0.61290323 0.71875 0.59375 ]
|
|
|
|
mean value: 0.5942540322580645
|
|
|
|
key: train_recall
|
|
value: [0.57692308 0.65384615 0.63636364 0.76223776 0.59090909 0.56293706
|
|
0.61324042 0.56445993 0.60839161 0.5979021 ]
|
|
|
|
mean value: 0.6167210837942545
|
|
|
|
key: test_roc_auc
|
|
value: [0.734375 0.71875 0.671875 0.625 0.59375 0.640625
|
|
0.66481855 0.66582661 0.74647177 0.70010081]
|
|
|
|
mean value: 0.6761592741935484
|
|
|
|
key: train_roc_auc
|
|
value: [0.68531469 0.70804196 0.69405594 0.73076923 0.7027972 0.67657343
|
|
0.68773909 0.68257962 0.71534563 0.69267927]
|
|
|
|
mean value: 0.6975896055164348
|
|
|
|
key: test_jcc
|
|
value: [0.575 0.5 0.44736842 0.46666667 0.36585366 0.425
|
|
0.44736842 0.475 0.58974359 0.5 ]
|
|
|
|
mean value: 0.4792000757052105
|
|
|
|
key: train_jcc
|
|
value: [0.47826087 0.52824859 0.50980392 0.58602151 0.49852507 0.46531792
|
|
0.49577465 0.47093023 0.51632047 0.49279539]
|
|
|
|
mean value: 0.5041998621174171
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.77
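Note on the output format above: keys such as fit_time, score_time, test_mcc and train_mcc follow the naming that scikit-learn's cross_validate uses when it is given a scoring dict and return_train_score=True. The sketch below shows one way such output could be produced; it is not the script behind this log. The toy data frame, the stratified 10-fold split, sparse_threshold=0 and handle_unknown="ignore" are assumptions. Only the column names and the MinMaxScaler/OneHotEncoder/GaussianNB steps are taken from the pipeline printed above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy data (assumption): two numeric columns and one categorical column.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.normal(size=n),
    "ddg_foldx": rng.normal(size=n),
    "ss_class": rng.choice(["helix", "sheet", "loop"], size=n),
})
y = (X["ligand_distance"] + rng.normal(scale=0.5, size=n) > 0).astype(int)

prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), ["ligand_distance", "ddg_foldx"]),
        # handle_unknown="ignore" is an added safeguard, not shown in the log.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["ss_class"]),
    ],
    remainder="passthrough",
    sparse_threshold=0,  # force dense output, which GaussianNB requires
)
pipe = Pipeline(steps=[("prep", prep), ("model", GaussianNB())])

# Scorer names become the suffixes of the test_* and train_* keys.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "jcc": make_scorer(jaccard_score),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring,
                        return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Each array holds one value per fold, which is why every "value:" entry in this log lists ten numbers.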
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01077843 0.01050925 0.01047873 0.01048326 0.0104928 0.01047349
|
|
0.01060224 0.01083875 0.01082993 0.01076794]
|
|
|
|
mean value: 0.010625481605529785
|
|
|
|
key: score_time
|
|
value: [0.00895715 0.00905252 0.00902224 0.00900888 0.00898409 0.0090363
|
|
0.00899148 0.0090642 0.00913739 0.00918102]
|
|
|
|
mean value: 0.009043526649475098
|
|
|
|
key: test_mcc
|
|
value: [0.31814238 0.34527065 0.44539933 0.34391797 0.42333825 0.62622429
|
|
0.36629686 0.42842742 0.33367758 0.49193548]
|
|
|
|
mean value: 0.4122630210919585
|
|
|
|
key: train_mcc
|
|
value: [0.53152212 0.51229647 0.48732947 0.48309663 0.4795166 0.51083262
|
|
0.46627502 0.46655817 0.5205467 0.5054141 ]
|
|
|
|
mean value: 0.4963387918137142
|
|
|
|
key: test_accuracy
|
|
value: [0.65625 0.671875 0.71875 0.671875 0.703125 0.8125
|
|
0.68253968 0.71428571 0.66666667 0.74603175]
|
|
|
|
mean value: 0.704389880952381
|
|
|
|
key: train_accuracy
|
|
value: [0.76398601 0.75524476 0.74300699 0.74125874 0.73951049 0.7534965
|
|
0.73298429 0.73298429 0.7591623 0.7521815 ]
|
|
|
|
mean value: 0.7473815887428453
|
|
|
|
key: test_fscore
|
|
value: [0.68571429 0.6557377 0.68965517 0.67692308 0.73972603 0.81818182
|
|
0.6875 0.70967742 0.68656716 0.75 ]
|
|
|
|
mean value: 0.709968266908221
|
|
|
|
key: train_fscore
|
|
value: [0.7768595 0.76510067 0.75210793 0.74744027 0.74529915 0.76771005
|
|
0.73846154 0.74023769 0.76923077 0.75932203]
|
|
|
|
mean value: 0.7561769601426575
|
|
|
|
key: test_precision
|
|
value: [0.63157895 0.68965517 0.76923077 0.66666667 0.65853659 0.79411765
|
|
0.66666667 0.70967742 0.65714286 0.75 ]
|
|
|
|
mean value: 0.6993272731268689
|
|
|
|
key: train_precision
|
|
value: [0.73667712 0.73548387 0.72638436 0.73 0.72909699 0.7258567
|
|
0.72483221 0.7218543 0.73717949 0.73684211]
|
|
|
|
mean value: 0.7304207151405426
|
|
|
|
key: test_recall
|
|
value: [0.75 0.625 0.625 0.6875 0.84375 0.84375
|
|
0.70967742 0.70967742 0.71875 0.75 ]
|
|
|
|
mean value: 0.7263104838709677
|
|
|
|
key: train_recall
|
|
value: [0.82167832 0.7972028 0.77972028 0.76573427 0.76223776 0.81468531
|
|
0.75261324 0.75958188 0.8041958 0.78321678]
|
|
|
|
mean value: 0.7840866450622548
|
|
|
|
key: test_roc_auc
|
|
value: [0.65625 0.671875 0.71875 0.671875 0.703125 0.8125
|
|
0.68296371 0.71421371 0.66582661 0.74596774]
|
|
|
|
mean value: 0.7043346774193548
|
|
|
|
key: train_roc_auc
|
|
value: [0.76398601 0.75524476 0.74300699 0.74125874 0.73951049 0.7534965
|
|
0.73294998 0.73293779 0.75924076 0.75223557]
|
|
|
|
mean value: 0.7473867595818815
|
|
|
|
key: test_jcc
|
|
value: [0.52173913 0.48780488 0.52631579 0.51162791 0.58695652 0.69230769
|
|
0.52380952 0.55 0.52272727 0.6 ]
|
|
|
|
mean value: 0.5523288715517611
|
|
|
|
key: train_jcc
|
|
value: [0.63513514 0.61956522 0.6027027 0.59673025 0.59400545 0.62299465
|
|
0.58536585 0.58760108 0.625 0.61202186]
|
|
|
|
mean value: 0.6081122192207598
|
|
|
|
MCC on Blind test: 0.4
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01007533 0.01112366 0.01109052 0.01143003 0.01140022 0.01098752
|
|
0.01147366 0.01116109 0.01144481 0.01098204]
|
|
|
|
mean value: 0.011116886138916015
|
|
|
|
key: score_time
|
|
value: [0.01728034 0.01390123 0.01371455 0.01369023 0.01358032 0.01316977
|
|
0.01375914 0.01355004 0.01325154 0.0130403 ]
|
|
|
|
mean value: 0.013893747329711914
|
|
|
|
key: test_mcc
|
|
value: [0.28249417 0.18786729 0.28138743 0.31311215 0.32897585 0.3480246
|
|
0.42986904 0.33367758 0.01513623 0.23761484]
|
|
|
|
mean value: 0.2758159171857665
|
|
|
|
key: train_mcc
|
|
value: [0.57695482 0.55611065 0.54898125 0.56711343 0.57342657 0.54204088
|
|
0.58510016 0.57416973 0.58464706 0.58468061]
|
|
|
|
mean value: 0.5693225176661459
|
|
|
|
key: test_accuracy
|
|
value: [0.640625 0.59375 0.640625 0.65625 0.65625 0.671875
|
|
0.71428571 0.66666667 0.50793651 0.61904762]
|
|
|
|
mean value: 0.6367311507936508
|
|
|
|
key: train_accuracy
|
|
value: [0.78846154 0.77797203 0.77447552 0.78321678 0.78671329 0.77097902
|
|
0.79232112 0.78708551 0.79232112 0.79232112]
|
|
|
|
mean value: 0.7845867047437728
|
|
|
|
key: test_fscore
|
|
value: [0.65671642 0.58064516 0.63492063 0.64516129 0.7027027 0.6440678
|
|
0.71875 0.6440678 0.52307692 0.63636364]
|
|
|
|
mean value: 0.6386472359807587
|
|
|
|
key: train_fscore
|
|
value: [0.78956522 0.77522124 0.77328647 0.7883959 0.78671329 0.7729636
|
|
0.78863233 0.78745645 0.79232112 0.79304348]
|
|
|
|
mean value: 0.7847599087821961
|
|
|
|
key: test_precision
|
|
value: [0.62857143 0.6 0.64516129 0.66666667 0.61904762 0.7037037
|
|
0.6969697 0.67857143 0.51515152 0.61764706]
|
|
|
|
mean value: 0.6371490407828169
|
|
|
|
key: train_precision
|
|
value: [0.78546713 0.78494624 0.77738516 0.77 0.78671329 0.76632302
|
|
0.80434783 0.78745645 0.79094077 0.78892734]
|
|
|
|
mean value: 0.784250720863634
|
|
|
|
key: test_recall
|
|
value: [0.6875 0.5625 0.625 0.625 0.8125 0.59375
|
|
0.74193548 0.61290323 0.53125 0.65625 ]
|
|
|
|
mean value: 0.644858870967742
|
|
|
|
key: train_recall
|
|
value: [0.79370629 0.76573427 0.76923077 0.80769231 0.78671329 0.77972028
|
|
0.77351916 0.78745645 0.79370629 0.7972028 ]
|
|
|
|
mean value: 0.7854681903462392
|
|
|
|
key: test_roc_auc
|
|
value: [0.640625 0.59375 0.640625 0.65625 0.65625 0.671875
|
|
0.71471774 0.66582661 0.50756048 0.61844758]
|
|
|
|
mean value: 0.6365927419354839
|
|
|
|
key: train_roc_auc
|
|
value: [0.78846154 0.77797203 0.77447552 0.78321678 0.78671329 0.77097902
|
|
0.79235399 0.78708487 0.79232353 0.79232962]
|
|
|
|
mean value: 0.7845910187373603
|
|
|
|
key: test_jcc
|
|
value: [0.48888889 0.40909091 0.46511628 0.47619048 0.54166667 0.475
|
|
0.56097561 0.475 0.35416667 0.46666667]
|
|
|
|
mean value: 0.4712762162996139
|
|
|
|
key: train_jcc
|
|
value: [0.65229885 0.63294798 0.63037249 0.65070423 0.64841499 0.6299435
|
|
0.65102639 0.64942529 0.65606936 0.65706052]
|
|
|
|
mean value: 0.6458263597269788
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03415418 0.02976036 0.03232479 0.02934623 0.02964926 0.02974415
|
|
0.02997589 0.02972007 0.02973104 0.02964139]
|
|
|
|
mean value: 0.03040473461151123
|
|
|
|
key: score_time
|
|
value: [0.01561642 0.01400876 0.01577091 0.01404595 0.01406598 0.01601458
|
|
0.01414895 0.01402998 0.01415586 0.01414967]
|
|
|
|
mean value: 0.014600706100463868
|
|
|
|
key: test_mcc
|
|
value: [0.32274861 0.40644851 0.59404013 0.44539933 0.52636136 0.51639778
|
|
0.4969666 0.56086231 0.42842742 0.55611985]
|
|
|
|
mean value: 0.48537719052534
|
|
|
|
key: train_mcc
|
|
value: [0.66001142 0.66528266 0.6179077 0.63177564 0.62412608 0.62471669
|
|
0.64145684 0.65479702 0.63778991 0.64216428]
|
|
|
|
mean value: 0.640002823010731
|
|
|
|
key: test_accuracy
|
|
value: [0.65625 0.703125 0.796875 0.71875 0.75 0.75
|
|
0.74603175 0.77777778 0.71428571 0.77777778]
|
|
|
|
mean value: 0.7390873015873016
|
|
|
|
key: train_accuracy
|
|
value: [0.82692308 0.83041958 0.8041958 0.81293706 0.80944056 0.80944056
|
|
0.81849913 0.82373473 0.81849913 0.81849913]
|
|
|
|
mean value: 0.8172588755049488
|
|
|
|
key: test_fscore
|
|
value: [0.69444444 0.6984127 0.79365079 0.74285714 0.78378378 0.77777778
|
|
0.75757576 0.78787879 0.71875 0.78787879]
|
|
|
|
mean value: 0.7543009974259974
|
|
|
|
key: train_fscore
|
|
value: [0.83797054 0.83966942 0.81993569 0.82487725 0.82101806 0.82160393
|
|
0.82894737 0.8363047 0.8225256 0.82894737]
|
|
|
|
mean value: 0.828179992797138
|
|
|
|
key: test_precision
|
|
value: [0.625 0.70967742 0.80645161 0.68421053 0.69047619 0.7
|
|
0.71428571 0.74285714 0.71875 0.76470588]
|
|
|
|
mean value: 0.7156414488545843
|
|
|
|
key: train_precision
|
|
value: [0.78769231 0.79623824 0.75892857 0.77538462 0.77399381 0.77230769
|
|
0.78504673 0.78181818 0.80333333 0.7826087 ]
|
|
|
|
mean value: 0.7817352179152481
|
|
|
|
key: test_recall
|
|
value: [0.78125 0.6875 0.78125 0.8125 0.90625 0.875
|
|
0.80645161 0.83870968 0.71875 0.8125 ]
|
|
|
|
mean value: 0.802016129032258
|
|
|
|
key: train_recall
|
|
value: [0.8951049 0.88811189 0.89160839 0.88111888 0.87412587 0.87762238
|
|
0.87804878 0.8989547 0.84265734 0.88111888]
|
|
|
|
mean value: 0.880847201578909
|
|
|
|
key: test_roc_auc
|
|
value: [0.65625 0.703125 0.796875 0.71875 0.75 0.75
|
|
0.74697581 0.77872984 0.71421371 0.77721774]
|
|
|
|
mean value: 0.7392137096774194
|
|
|
|
key: train_roc_auc
|
|
value: [0.82692308 0.83041958 0.8041958 0.81293706 0.80944056 0.80944056
|
|
0.81839502 0.82360323 0.81854121 0.81860822]
|
|
|
|
mean value: 0.8172504324943349
|
|
|
|
key: test_jcc
|
|
value: [0.53191489 0.53658537 0.65789474 0.59090909 0.64444444 0.63636364
|
|
0.6097561 0.65 0.56097561 0.65 ]
|
|
|
|
mean value: 0.606884387534703
|
|
|
|
key: train_jcc
|
|
value: [0.72112676 0.72364672 0.69482289 0.70194986 0.69637883 0.69722222
|
|
0.70786517 0.71866295 0.69855072 0.70786517]
|
|
|
|
mean value: 0.7068091299886077
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.74
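The per-fold scores above can be cross-checked by hand. The sketch below is a minimal illustration, not the script that produced this log: `binary_metrics` is a hypothetical helper, and the confusion-matrix counts are not logged but inferred from the first SVM test fold (64 samples, accuracy 0.65625, recall 0.78125, precision 0.625). The `roc_auc` line assumes the logged ROC AUC was computed from hard class predictions, which makes it equal to balanced accuracy; that assumption fits the values above but the log does not confirm it.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute the scores named in the log from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": (tp * tn - fp * fn) / mcc_den,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": recall,
        # Assumption: ROC AUC on hard labels, i.e. balanced accuracy.
        "roc_auc": (recall + specificity) / 2,
        "jcc": tp / (tp + fp + fn),
    }


# Counts inferred from the first SVM test fold; they are not printed in the log.
fold1 = binary_metrics(tp=25, fp=15, fn=7, tn=17)
for name, value in fold1.items():
    print(f"test_{name}: {value:.8f}")
```

With these counts the helper reproduces the first entry of each SVM `test_*` array, which suggests that `jcc` is the Jaccard index of the positive class.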
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.13041496 2.18040395 2.15946102 2.10226822 2.21306634 2.20666718
|
|
1.96261048 2.13868785 1.92572284 2.1737442 ]
|
|
|
|
mean value: 2.1193047046661375
|
|
|
|
key: score_time
|
|
value: [0.01501131 0.01783228 0.02522731 0.01474667 0.01502776 0.01275539
|
|
0.01284885 0.01510572 0.01248312 0.01251149]
|
|
|
|
mean value: 0.01535499095916748
|
|
|
|
key: test_mcc
|
|
value: [0.28249417 0.25 0.68884672 0.46897905 0.48038446 0.438357
|
|
0.66853948 0.58770161 0.40327957 0.52371369]
|
|
|
|
mean value: 0.4792295749104151
|
|
|
|
key: train_mcc
|
|
value: [0.97212305 0.9652474 0.95804196 0.96159136 0.95862812 0.95142124
|
|
0.92401632 0.95122576 0.93368746 0.96509588]
|
|
|
|
mean value: 0.9541078553036163
|
|
|
|
key: test_accuracy
|
|
value: [0.640625 0.625 0.84375 0.734375 0.734375 0.71875
|
|
0.82539683 0.79365079 0.6984127 0.76190476]
|
|
|
|
mean value: 0.737624007936508
|
|
|
|
key: train_accuracy
|
|
value: [0.98601399 0.98251748 0.97902098 0.98076923 0.97902098 0.97552448
|
|
0.96160558 0.97556719 0.96684119 0.98254799]
|
|
|
|
mean value: 0.9769429087491914
|
|
|
|
key: test_fscore
|
|
value: [0.65671642 0.625 0.83870968 0.73015873 0.76056338 0.70967742
|
|
0.84057971 0.79365079 0.6779661 0.76923077]
|
|
|
|
mean value: 0.7402252999846467
|
|
|
|
key: train_fscore
|
|
value: [0.98611111 0.98269896 0.97902098 0.98086957 0.97938144 0.97586207
|
|
0.96245734 0.97577855 0.96672504 0.98251748]
|
|
|
|
mean value: 0.9771422540448765
|
|
|
|
key: test_precision
|
|
value: [0.62857143 0.625 0.86666667 0.74193548 0.69230769 0.73333333
|
|
0.76315789 0.78125 0.74074074 0.75757576]
|
|
|
|
mean value: 0.7330538997803429
|
|
|
|
key: train_precision
|
|
value: [0.97931034 0.97260274 0.97902098 0.97577855 0.96283784 0.96258503
|
|
0.94314381 0.96907216 0.96842105 0.98251748]
|
|
|
|
mean value: 0.9695289994945384
|
|
|
|
key: test_recall
|
|
value: [0.6875 0.625 0.8125 0.71875 0.84375 0.6875
|
|
0.93548387 0.80645161 0.625 0.78125 ]
|
|
|
|
mean value: 0.7523185483870968
|
|
|
|
key: train_recall
|
|
value: [0.99300699 0.99300699 0.97902098 0.98601399 0.9965035 0.98951049
|
|
0.9825784 0.9825784 0.96503497 0.98251748]
|
|
|
|
mean value: 0.9849772179040471
|
|
|
|
key: test_roc_auc
|
|
value: [0.640625 0.625 0.84375 0.734375 0.734375 0.71875
|
|
0.82711694 0.79385081 0.69959677 0.76159274]
|
|
|
|
mean value: 0.7379032258064516
|
|
|
|
key: train_roc_auc
|
|
value: [0.98601399 0.98251748 0.97902098 0.98076923 0.97902098 0.97552448
|
|
0.96156892 0.97555493 0.96683804 0.98254794]
|
|
|
|
mean value: 0.9769376964498916
|
|
|
|
key: test_jcc
|
|
value: [0.48888889 0.45454545 0.72222222 0.575 0.61363636 0.55
|
|
0.725 0.65789474 0.51282051 0.625 ]
|
|
|
|
mean value: 0.5925008178955548
|
|
|
|
key: train_jcc
|
|
value: [0.97260274 0.96598639 0.95890411 0.96245734 0.95959596 0.95286195
|
|
0.92763158 0.9527027 0.93559322 0.96563574]
|
|
|
|
mean value: 0.9553971735035433
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04582191 0.03966475 0.03780389 0.0354073 0.03452015 0.02935576
|
|
0.03461981 0.0351541 0.04046655 0.03755069]
|
|
|
|
mean value: 0.03703649044036865
|
|
|
|
key: score_time
|
|
value: [0.0110054 0.00905657 0.00910783 0.0090065 0.00911236 0.00901842
|
|
0.00909066 0.00909805 0.00918746 0.0090847 ]
|
|
|
|
mean value: 0.009276795387268066
|
|
|
|
key: test_mcc
|
|
value: [0.75146915 0.46897905 0.71910121 0.790965 0.56360186 0.75146915
|
|
0.65120968 0.77822581 0.49241885 0.52419355]
|
|
|
|
mean value: 0.6491633296175138
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.734375 0.859375 0.890625 0.78125 0.875
|
|
0.82539683 0.88888889 0.74603175 0.76190476]
|
|
|
|
mean value: 0.8237847222222222
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.87878788 0.73015873 0.85714286 0.88135593 0.77419355 0.87096774
|
|
0.82539683 0.88888889 0.75757576 0.76190476]
|
|
|
|
mean value: 0.8226372922381671
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.85294118 0.74193548 0.87096774 0.96296296 0.8 0.9
|
|
0.8125 0.875 0.73529412 0.77419355]
|
|
|
|
mean value: 0.8325795031274158
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90625 0.71875 0.84375 0.8125 0.75 0.84375
|
|
0.83870968 0.90322581 0.78125 0.75 ]
|
|
|
|
mean value: 0.8148185483870968
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.734375 0.859375 0.890625 0.78125 0.875
|
|
0.82560484 0.8891129 0.74546371 0.76209677]
|
|
|
|
mean value: 0.8237903225806452
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.78378378 0.575 0.75 0.78787879 0.63157895 0.77142857
|
|
0.7027027 0.8 0.6097561 0.61538462]
|
|
|
|
mean value: 0.7027513506107858
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
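To compare models without reading the arrays by eye, the `key:` / `value:` blocks can be parsed back into numbers. This is a rough sketch using only the standard library: `parse_cv_blocks` is a hypothetical helper, it assumes the layout seen in this log (lone `|` separator lines, arrays wrapped over two lines), and it handles one model block at a time because keys repeat across models.

```python
import re
from statistics import fmean


def parse_cv_blocks(log_text):
    """Return {key: [floats]} for each 'key: ... value: [...]' block."""
    # Drop blank lines and the lone '|' separator lines before matching.
    lines = [ln for ln in log_text.splitlines() if ln.strip() not in ("", "|")]
    cleaned = "\n".join(lines)
    pattern = re.compile(r"key:\s*(\w+)\s*value:\s*\[([^\]]*)\]")
    return {
        key: [float(tok) for tok in body.split()]
        for key, body in pattern.findall(cleaned)
    }


# Sample copied from the Decision Tree block above.
sample = """key: test_mcc
|
|
value: [0.75146915 0.46897905 0.71910121 0.790965   0.56360186 0.75146915
|
|
 0.65120968 0.77822581 0.49241885 0.52419355]
|
|
mean value: 0.6491633296175138
"""

scores = parse_cv_blocks(sample)
print(len(scores["test_mcc"]), fmean(scores["test_mcc"]))
```

The recomputed mean agrees with the logged `mean value` up to the rounding of the printed array entries.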
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.15573025 0.14470005 0.14443851 0.14479327 0.14609981 0.14602733
|
|
0.14412546 0.14521909 0.14652395 0.14871454]
|
|
|
|
mean value: 0.146637225151062
|
|
|
|
key: score_time
|
|
value: [0.01821351 0.01839828 0.01836967 0.01827264 0.01836801 0.01833248
|
|
0.01864052 0.01838899 0.01829696 0.02000189]
|
|
|
|
mean value: 0.018528294563293458
|
|
|
|
key: test_mcc
|
|
value: [0.46897905 0.4113018 0.5336001 0.5 0.42333825 0.62994079
|
|
0.51058887 0.49193548 0.42986904 0.52371369]
|
|
|
|
mean value: 0.49232670574184606
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.734375 0.703125 0.765625 0.75 0.703125 0.8125
|
|
0.74603175 0.74603175 0.71428571 0.76190476]
|
|
|
|
mean value: 0.7437003968253968
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.73846154 0.6779661 0.75409836 0.75 0.73972603 0.82352941
|
|
0.77142857 0.74193548 0.70967742 0.76923077]
|
|
|
|
mean value: 0.7476053683859305
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.74074074 0.79310345 0.75 0.65853659 0.77777778
|
|
0.69230769 0.74193548 0.73333333 0.75757576]
|
|
|
|
mean value: 0.7372583546520712
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.625 0.71875 0.75 0.84375 0.875
|
|
0.87096774 0.74193548 0.6875 0.78125 ]
|
|
|
|
mean value: 0.7644153225806452
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.734375 0.703125 0.765625 0.75 0.703125 0.8125
|
|
0.74798387 0.74596774 0.71471774 0.76159274]
|
|
|
|
mean value: 0.7439012096774194
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.58536585 0.51282051 0.60526316 0.6 0.58695652 0.7
|
|
0.62790698 0.58974359 0.55 0.625 ]
|
|
|
|
mean value: 0.5983056612600692
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
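
The per-fold keys above (test_mcc, test_accuracy, test_fscore, test_precision, test_recall, test_roc_auc, test_jcc) can all be derived from one confusion matrix. The sketch below is a minimal illustration, not the script's own code. The counts TP=27, FP=12, FN=4, TN=20 are an assumption: they were inferred by back-solving the seventh fold of the Extra Trees block and are not printed anywhere in the log. The helper name `binary_metrics` is hypothetical. The `roc_auc` entry is computed as (recall + specificity) / 2, which is what ROC AUC reduces to when it is scored on hard class labels; that reading fits the logged numbers but is not confirmed by the source.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return the log's metric set for one binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "precision": precision,
        "recall": recall,
        # AUC of a single operating point (hard labels) = balanced accuracy
        "roc_auc": (recall + specificity) / 2,
        "jcc": tp / (tp + fp + fn),    # Jaccard index of the positive class
    }


# Inferred (assumed) counts for fold 7 of the Extra Trees block: 31 positives, 32 negatives.
fold7 = binary_metrics(tp=27, fp=12, fn=4, tn=20)
for name, value in fold7.items():
    print(f"{name}: {value:.8f}")
```

With these counts the function reproduces the seventh entry of each test_* array in the Extra Trees block (for example test_mcc 0.51058887 and test_roc_auc 0.74798387), which is why test_roc_auc tracks test_accuracy so closely on these near-balanced folds.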
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01084328 0.01262331 0.01136661 0.01074314 0.01116204 0.01096058
|
|
0.01086903 0.0107832 0.01119852 0.01064134]
|
|
|
|
mean value: 0.01111910343170166
|
|
|
|
key: score_time
|
|
value: [0.0092206 0.01025367 0.00897503 0.00894499 0.00896931 0.00879312
|
|
0.00883961 0.00905681 0.00896764 0.00889587]
|
|
|
|
mean value: 0.009091663360595702
|
|
|
|
key: test_mcc
|
|
value: [0.19088543 0.28138743 0.25451391 0.2847474 0.13159034 0.34391797
|
|
0.26942496 0.21080523 0.07809475 0.18338233]
|
|
|
|
mean value: 0.2228749746098926
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.59375 0.640625 0.625 0.640625 0.5625 0.671875
|
|
0.63492063 0.6031746 0.53968254 0.58730159]
|
|
|
|
mean value: 0.6099454365079365
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.62857143 0.64615385 0.5862069 0.61016949 0.62162162 0.67692308
|
|
0.62295082 0.62686567 0.56716418 0.65789474]
|
|
|
|
mean value: 0.6244521768607626
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.57894737 0.63636364 0.65384615 0.66666667 0.54761905 0.66666667
|
|
0.63333333 0.58333333 0.54285714 0.56818182]
|
|
|
|
mean value: 0.6077815167288851
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.6875 0.65625 0.53125 0.5625 0.71875 0.6875
|
|
0.61290323 0.67741935 0.59375 0.78125 ]
|
|
|
|
mean value: 0.6509072580645161
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.59375 0.640625 0.625 0.640625 0.5625 0.671875
|
|
0.63457661 0.60433468 0.53881048 0.58417339]
|
|
|
|
mean value: 0.6096270161290323
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.45833333 0.47727273 0.41463415 0.43902439 0.45098039 0.51162791
|
|
0.45238095 0.45652174 0.39583333 0.49019608]
|
|
|
|
mean value: 0.4546804999601126
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.42
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.13523126 2.13666153 2.17790389 2.1243844 2.14154196 2.14696455
|
|
2.11618638 2.12946486 2.18791819 2.1946516 ]
|
|
|
|
mean value: 2.14909086227417
|
|
|
|
key: score_time
|
|
value: [0.09532213 0.09471822 0.1032505 0.09645987 0.10374951 0.09507036
|
|
0.09460235 0.0944438 0.09846973 0.09542656]
|
|
|
|
mean value: 0.0971513032913208
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.65657067 0.75146915 0.78163175 0.75592895 0.68884672
|
|
0.65419917 0.74772995 0.68245968 0.65821474]
|
|
|
|
mean value: 0.7191142340192237
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.828125 0.875 0.890625 0.875 0.84375
|
|
0.82539683 0.87301587 0.84126984 0.82539683]
|
|
|
|
mean value: 0.8583829365079365
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.83076923 0.87096774 0.89230769 0.88235294 0.83870968
|
|
0.83076923 0.875 0.84375 0.84057971]
|
|
|
|
mean value: 0.86142971336133
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.81818182 0.9 0.87878788 0.83333333 0.86666667
|
|
0.79411765 0.84848485 0.84375 0.78378378]
|
|
|
|
mean value: 0.8449458917473623
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.84375 0.84375 0.90625 0.9375 0.8125
|
|
0.87096774 0.90322581 0.84375 0.90625 ]
|
|
|
|
mean value: 0.8805443548387096
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.828125 0.875 0.890625 0.875 0.84375
|
|
0.82610887 0.8734879 0.84122984 0.82409274]
|
|
|
|
mean value: 0.858366935483871
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.71052632 0.77142857 0.80555556 0.78947368 0.72222222
|
|
0.71052632 0.77777778 0.72972973 0.725 ]
|
|
|
|
mean value: 0.7575573505836664
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.97181416 1.06746793 1.03298688 0.99328613 1.07194424 1.03769469
|
|
1.07363486 1.10688138 0.99139166 1.02051616]
|
|
|
|
mean value: 1.0367618083953858
|
|
|
|
key: score_time
|
|
value: [0.19423556 0.18731165 0.27005219 0.29324675 0.28291416 0.22762942
|
|
0.26725793 0.29875374 0.31482267 0.28389359]
|
|
|
|
mean value: 0.2620117664337158
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.6875 0.78163175 0.78163175 0.78470603 0.75
|
|
0.72407013 0.71471774 0.71705182 0.68740835]
|
|
|
|
mean value: 0.744280914258336
|
|
|
|
key: train_mcc
|
|
value: [0.90989218 0.90653139 0.90626497 0.90604313 0.92418486 0.90964713
|
|
0.91004139 0.91658347 0.90621124 0.90262515]
|
|
|
|
mean value: 0.9098024895604151
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.84375 0.890625 0.890625 0.890625 0.875
|
|
0.85714286 0.85714286 0.85714286 0.84126984]
|
|
|
|
mean value: 0.8709573412698413
|
|
|
|
key: train_accuracy
|
|
value: [0.95454545 0.9527972 0.9527972 0.9527972 0.96153846 0.95454545
|
|
0.95462478 0.95811518 0.95287958 0.95113438]
|
|
|
|
mean value: 0.9545774905722549
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.84375 0.88888889 0.89230769 0.89552239 0.875
|
|
0.86567164 0.85714286 0.86567164 0.85294118]
|
|
|
|
mean value: 0.8745987195542726
|
|
|
|
key: train_fscore
|
|
value: [0.95547945 0.95384615 0.95368782 0.9535284 0.96245734 0.95532646
|
|
0.9556314 0.95876289 0.9535284 0.95172414]
|
|
|
|
mean value: 0.95539724483478
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.84375 0.90322581 0.87878788 0.85714286 0.875
|
|
0.80555556 0.84375 0.82857143 0.80555556]
|
|
|
|
mean value: 0.8523692023241359
|
|
|
|
key: train_precision
|
|
value: [0.93624161 0.93311037 0.93602694 0.93898305 0.94 0.93918919
|
|
0.93645485 0.94576271 0.93898305 0.93877551]
|
|
|
|
mean value: 0.9383527277109088
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.84375 0.875 0.90625 0.9375 0.875
|
|
0.93548387 0.87096774 0.90625 0.90625 ]
|
|
|
|
mean value: 0.8993951612903226
|
|
|
|
key: train_recall
|
|
value: [0.97552448 0.97552448 0.97202797 0.96853147 0.98601399 0.97202797
|
|
0.97560976 0.97212544 0.96853147 0.96503497]
|
|
|
|
mean value: 0.9730951974854414
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.84375 0.890625 0.890625 0.890625 0.875
|
|
0.85836694 0.85735887 0.85635081 0.84022177]
|
|
|
|
mean value: 0.8709173387096774
|
|
|
|
key: train_roc_auc
|
|
value: [0.95454545 0.9527972 0.9527972 0.9527972 0.96153846 0.95454545
|
|
0.95458809 0.95809069 0.95290685 0.9511586 ]
|
|
|
|
mean value: 0.9545765210399356
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.72972973 0.8 0.80555556 0.81081081 0.77777778
|
|
0.76315789 0.75 0.76315789 0.74358974]
|
|
|
|
mean value: 0.7777112740270635
|
|
|
|
key: train_jcc
|
|
value: [0.9147541 0.91176471 0.91147541 0.91118421 0.92763158 0.91447368
|
|
0.91503268 0.92079208 0.91118421 0.90789474]
|
|
|
|
mean value: 0.9146187394078189
|
|
|
|
MCC on Blind test: 0.64
|
|
|
|
Accuracy on Blind test: 0.82
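
Each model block above follows the same "key:" / "value: [...]" / "mean value:" layout, so the per-fold arrays can be recovered from the text. The following is a minimal sketch under stated assumptions, not part of the original script. It assumes arrays are space-separated floats inside one pair of square brackets, possibly wrapped over several lines, and that separator lines are empty or a lone "|". The function name `parse_cv_blocks` is hypothetical. It keeps only the first token after "key:" so that stray warning text on the same line is ignored.

```python
from statistics import mean


def parse_cv_blocks(lines):
    """Collect 'key:' / 'value: [...]' pairs from log lines into {key: [floats]}."""
    results = {}
    key = None
    buffer = None  # text of a bracketed array that may wrap over several lines
    for raw in lines:
        line = raw.strip()
        if line in ("", "|"):
            continue  # separator residue
        if line.startswith("key:"):
            tokens = line[len("key:"):].split()
            key = tokens[0] if tokens else None
            buffer = None
            continue
        if line.startswith("value:") and key is not None:
            buffer = line[len("value:"):]
        elif buffer is not None:
            buffer += " " + line  # continuation of a wrapped array
        if buffer is not None and "]" in buffer:
            cleaned = buffer.replace("[", " ").replace("]", " ")
            results[key] = [float(tok) for tok in cleaned.split()]
            key, buffer = None, None
    return results


# Lines copied from the Random Forest2 block of this log.
sample = [
    "key: test_mcc",
    "|",
    "|",
    "value: [0.81409158 0.6875     0.78163175 0.78163175 0.78470603 0.75",
    "|",
    "|",
    "0.72407013 0.71471774 0.71705182 0.68740835]",
    "|",
    "mean value: 0.744280914258336",
]
parsed = parse_cv_blocks(sample)
print(len(parsed["test_mcc"]), mean(parsed["test_mcc"]))
```

Averaging the parsed array gives a quick consistency check against the logged "mean value:" line for the same key.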
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02556682 0.01088738 0.01087403 0.01076269 0.01079369 0.01202536
|
|
0.0111661 0.01087999 0.01215863 0.01092458]
|
|
|
|
mean value: 0.01260392665863037
|
|
|
|
key: score_time
|
|
value: [0.01301289 0.00922108 0.00939584 0.00923276 0.00920796 0.00946498
|
|
0.00923038 0.00919342 0.00994778 0.00926733]
|
|
|
|
mean value: 0.009717440605163575
|
|
|
|
key: test_mcc
|
|
value: [0.31814238 0.34527065 0.44539933 0.34391797 0.42333825 0.62622429
|
|
0.36629686 0.42842742 0.33367758 0.49193548]
|
|
|
|
mean value: 0.4122630210919585
|
|
|
|
key: train_mcc
|
|
value: [0.53152212 0.51229647 0.48732947 0.48309663 0.4795166 0.51083262
|
|
0.46627502 0.46655817 0.5205467 0.5054141 ]
|
|
|
|
mean value: 0.4963387918137142
|
|
|
|
key: test_accuracy
|
|
value: [0.65625 0.671875 0.71875 0.671875 0.703125 0.8125
|
|
0.68253968 0.71428571 0.66666667 0.74603175]
|
|
|
|
mean value: 0.704389880952381
|
|
|
|
key: train_accuracy
|
|
value: [0.76398601 0.75524476 0.74300699 0.74125874 0.73951049 0.7534965
|
|
0.73298429 0.73298429 0.7591623 0.7521815 ]
|
|
|
|
mean value: 0.7473815887428453
|
|
|
|
key: test_fscore
|
|
value: [0.68571429 0.6557377 0.68965517 0.67692308 0.73972603 0.81818182
|
|
0.6875 0.70967742 0.68656716 0.75 ]
|
|
|
|
mean value: 0.709968266908221
|
|
|
|
key: train_fscore
|
|
value: [0.7768595 0.76510067 0.75210793 0.74744027 0.74529915 0.76771005
|
|
0.73846154 0.74023769 0.76923077 0.75932203]
|
|
|
|
mean value: 0.7561769601426575
|
|
|
|
key: test_precision
|
|
value: [0.63157895 0.68965517 0.76923077 0.66666667 0.65853659 0.79411765
|
|
0.66666667 0.70967742 0.65714286 0.75 ]
|
|
|
|
mean value: 0.6993272731268689
|
|
|
|
key: train_precision
|
|
value: [0.73667712 0.73548387 0.72638436 0.73 0.72909699 0.7258567
|
|
0.72483221 0.7218543 0.73717949 0.73684211]
|
|
|
|
mean value: 0.7304207151405426
|
|
|
|
key: test_recall
|
|
value: [0.75 0.625 0.625 0.6875 0.84375 0.84375
|
|
0.70967742 0.70967742 0.71875 0.75 ]
|
|
|
|
mean value: 0.7263104838709677
|
|
|
|
key: train_recall
|
|
value: [0.82167832 0.7972028 0.77972028 0.76573427 0.76223776 0.81468531
|
|
0.75261324 0.75958188 0.8041958 0.78321678]
|
|
|
|
mean value: 0.7840866450622548
|
|
|
|
key: test_roc_auc
|
|
value: [0.65625 0.671875 0.71875 0.671875 0.703125 0.8125
|
|
0.68296371 0.71421371 0.66582661 0.74596774]
|
|
|
|
mean value: 0.7043346774193548
|
|
|
|
key: train_roc_auc
|
|
value: [0.76398601 0.75524476 0.74300699 0.74125874 0.73951049 0.7534965
|
|
0.73294998 0.73293779 0.75924076 0.75223557]
|
|
|
|
mean value: 0.7473867595818815
|
|
|
|
key: test_jcc
|
|
value: [0.52173913 0.48780488 0.52631579 0.51162791 0.58695652 0.69230769
|
|
0.52380952 0.55 0.52272727 0.6 ]
|
|
|
|
mean value: 0.5523288715517611
|
|
|
|
key: train_jcc
|
|
value: [0.63513514 0.61956522 0.6027027 0.59673025 0.59400545 0.62299465
|
|
0.58536585 0.58760108 0.625 0.61202186]
|
|
|
|
mean value: 0.6081122192207598
|
|
|
|
MCC on Blind test: 0.4
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.30241895 0.10555077 0.10573483 0.10488105 0.10599661 0.11018038
|
|
0.10661006 0.10959363 0.10565329 0.14151597]
|
|
|
|
mean value: 0.129813551902771
|
|
|
|
key: score_time
|
|
value: [0.01123381 0.01157808 0.01123309 0.01145458 0.01132345 0.01121831
|
|
0.01126218 0.01123905 0.01134586 0.01125169]
|
|
|
|
mean value: 0.011314010620117188
|
|
|
|
key: test_mcc
|
|
value: [0.875 0.71910121 0.81409158 0.84748251 0.78163175 0.75146915
|
|
0.68865372 0.74772995 0.77822581 0.77800241]
|
|
|
|
mean value: 0.7781388085762072
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.859375 0.90625 0.921875 0.890625 0.875
|
|
0.84126984 0.87301587 0.88888889 0.88888889]
|
|
|
|
mean value: 0.8882688492063492
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.9375 0.86153846 0.90322581 0.91803279 0.89230769 0.87096774
|
|
0.84848485 0.875 0.88888889 0.89230769]
|
|
|
|
mean value: 0.8888253918799927
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.84848485 0.93333333 0.96551724 0.87878788 0.9
|
|
0.8 0.84848485 0.90322581 0.87878788]
|
|
|
|
mean value: 0.8894121835709712
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.875 0.875 0.875 0.90625 0.84375
|
|
0.90322581 0.90322581 0.875 0.90625 ]
|
|
|
|
mean value: 0.8900201612903226
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.859375 0.90625 0.921875 0.890625 0.875
|
|
0.8422379 0.8734879 0.8891129 0.88860887]
|
|
|
|
mean value: 0.8884072580645161
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88235294 0.75675676 0.82352941 0.84848485 0.80555556 0.77142857
|
|
0.73684211 0.77777778 0.8 0.80555556]
|
|
|
|
mean value: 0.80082835237634
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0486443 0.06402254 0.04328895 0.08116221 0.05702353 0.09169793
|
|
0.07593226 0.04668522 0.07705951 0.04529047]
|
|
|
|
mean value: 0.06308069229125976
|
|
|
|
key: score_time
|
|
value: [0.01902747 0.01237202 0.01238132 0.01268625 0.02459049 0.01931906
|
|
0.01228476 0.01229668 0.01233292 0.02081442]
|
|
|
|
mean value: 0.015810537338256835
|
|
|
|
key: test_mcc
|
|
value: [0.38177086 0.31311215 0.65657067 0.55359617 0.4163332 0.65657067
|
|
0.53874599 0.61982085 0.30272467 0.46146899]
|
|
|
|
mean value: 0.49007142138993015
|
|
|
|
key: train_mcc
|
|
value: [0.77812872 0.80152775 0.7847726 0.76315265 0.76780686 0.76042114
|
|
0.79618159 0.78218049 0.76695074 0.8069451 ]
|
|
|
|
mean value: 0.780806763235838
|
|
|
|
key: test_accuracy
|
|
value: [0.6875 0.65625 0.828125 0.765625 0.703125 0.828125
|
|
0.76190476 0.80952381 0.65079365 0.73015873]
|
|
|
|
mean value: 0.7421130952380952
|
|
|
|
key: train_accuracy
|
|
value: [0.88811189 0.90034965 0.89160839 0.88111888 0.88286713 0.87937063
|
|
0.89703316 0.89005236 0.88307155 0.90226876]
|
|
|
|
mean value: 0.8895852402396905
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.64516129 0.83076923 0.79452055 0.73239437 0.82539683
|
|
0.7826087 0.8 0.64516129 0.74626866]
|
|
|
|
mean value: 0.7516566617607912
|
|
|
|
key: train_fscore
|
|
value: [0.89189189 0.9025641 0.89491525 0.88395904 0.88701518 0.88324873
|
|
0.90084034 0.89411765 0.88547009 0.90572391]
|
|
|
|
mean value: 0.8929746175479386
|
|
|
|
key: test_precision
|
|
value: [0.65789474 0.66666667 0.81818182 0.70731707 0.66666667 0.83870968
|
|
0.71052632 0.82758621 0.66666667 0.71428571]
|
|
|
|
mean value: 0.727450154258575
|
|
|
|
key: train_precision
|
|
value: [0.8627451 0.88294314 0.86842105 0.86333333 0.85667752 0.8557377
|
|
0.87012987 0.86363636 0.86622074 0.87337662]
|
|
|
|
mean value: 0.8663221450093648
|
|
|
|
key: test_recall
|
|
value: [0.78125 0.625 0.84375 0.90625 0.8125 0.8125
|
|
0.87096774 0.77419355 0.625 0.78125 ]
|
|
|
|
mean value: 0.783266129032258
|
|
|
|
key: train_recall
|
|
value: [0.92307692 0.92307692 0.92307692 0.90559441 0.91958042 0.91258741
|
|
0.93379791 0.92682927 0.90559441 0.94055944]
|
|
|
|
mean value: 0.9213774030847202
|
|
|
|
key: test_roc_auc
|
|
value: [0.6875 0.65625 0.828125 0.765625 0.703125 0.828125
|
|
0.76360887 0.80897177 0.65120968 0.72933468]
|
|
|
|
mean value: 0.7421875
|
|
|
|
key: train_roc_auc
|
|
value: [0.88811189 0.90034965 0.89160839 0.88111888 0.88286713 0.87937063
|
|
0.89696888 0.88998806 0.88311079 0.90233547]
|
|
|
|
mean value: 0.8895829779976121
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.47619048 0.71052632 0.65909091 0.57777778 0.7027027
|
|
0.64285714 0.66666667 0.47619048 0.5952381 ]
|
|
|
|
mean value: 0.6062796118059276
|
|
|
|
key: train_jcc
|
|
value: [0.80487805 0.82242991 0.80981595 0.79204893 0.7969697 0.79090909
|
|
0.81957187 0.80851064 0.79447853 0.82769231]
|
|
|
|
mean value: 0.8067304962826153
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02168322 0.01041508 0.01006579 0.01009798 0.0101316 0.01007843
|
|
0.00995684 0.01002717 0.01006675 0.01005673]
|
|
|
|
mean value: 0.01125795841217041
|
|
|
|
key: score_time
|
|
value: [0.01007843 0.00911808 0.00869632 0.00869465 0.00870037 0.00874281
|
|
0.00876069 0.00872588 0.00868654 0.0087564 ]
|
|
|
|
mean value: 0.00889601707458496
|
|
|
|
key: test_mcc
|
|
value: [0.51639778 0.40644851 0.34527065 0.25197632 0.54443572 0.5336001
|
|
0.46743768 0.4969666 0.5892604 0.58728587]
|
|
|
|
mean value: 0.4739079632497716
|
|
|
|
key: train_mcc
|
|
value: [0.51458239 0.49592887 0.46744423 0.48759325 0.52117846 0.51787859
|
|
0.48968865 0.48599733 0.52422453 0.50849873]
|
|
|
|
mean value: 0.5013015026995373
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.703125 0.671875 0.625 0.765625 0.765625
|
|
0.73015873 0.74603175 0.79365079 0.79365079]
|
|
|
|
mean value: 0.7344742063492063
|
|
|
|
key: train_accuracy
|
|
value: [0.75524476 0.7465035 0.73251748 0.74300699 0.75874126 0.75699301
|
|
0.7434555 0.7417103 0.7609075 0.7521815 ]
|
|
|
|
mean value: 0.7491261792308913
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.70769231 0.6557377 0.64705882 0.78873239 0.7761194
|
|
0.74626866 0.75757576 0.80597015 0.8 ]
|
|
|
|
mean value: 0.7462932974814709
|
|
|
|
key: train_fscore
|
|
value: [0.76973684 0.75953566 0.74542429 0.75294118 0.77227723 0.77100494
|
|
0.75702479 0.75496689 0.77128548 0.76644737]
|
|
|
|
mean value: 0.7620644661560988
|
|
|
|
key: test_precision
|
|
value: [0.7 0.6969697 0.68965517 0.61111111 0.71794872 0.74285714
|
|
0.69444444 0.71428571 0.77142857 0.78787879]
|
|
|
|
mean value: 0.712657935933798
|
|
|
|
key: train_precision
|
|
value: [0.72670807 0.72239748 0.71111111 0.72491909 0.73125 0.72897196
|
|
0.72012579 0.7192429 0.73801917 0.72360248]
|
|
|
|
mean value: 0.7246348060626768
|
|
|
|
key: test_recall
|
|
value: [0.875 0.71875 0.625 0.6875 0.875 0.8125
|
|
0.80645161 0.80645161 0.84375 0.8125 ]
|
|
|
|
mean value: 0.7862903225806451
|
|
|
|
key: train_recall
|
|
value: [0.81818182 0.8006993 0.78321678 0.78321678 0.81818182 0.81818182
|
|
0.79790941 0.79442509 0.80769231 0.81468531]
|
|
|
|
mean value: 0.8036390438829464
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.703125 0.671875 0.625 0.765625 0.765625
|
|
0.73135081 0.74697581 0.79284274 0.79334677]
|
|
|
|
mean value: 0.7345766129032258
|
|
|
|
key: train_roc_auc
|
|
value: [0.75524476 0.7465035 0.73251748 0.74300699 0.75874126 0.75699301
|
|
0.7433603 0.74161814 0.76098901 0.75229039]
|
|
|
|
mean value: 0.7491264832728247
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.54761905 0.48780488 0.47826087 0.65116279 0.63414634
|
|
0.5952381 0.6097561 0.675 0.66666667]
|
|
|
|
mean value: 0.5982018423223509
|
|
|
|
key: train_jcc
|
|
value: [0.62566845 0.61229947 0.59416446 0.60377358 0.62903226 0.62734584
|
|
0.60904255 0.60638298 0.62771739 0.62133333]
|
|
|
|
mean value: 0.6156760314698697
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01659179 0.02303815 0.02188993 0.0252893 0.02217102 0.01938987
|
|
0.02506042 0.02309823 0.02040172 0.02841473]
|
|
|
|
mean value: 0.022534513473510744
|
|
|
|
key: score_time
|
|
value: [0.01002479 0.01128006 0.01193643 0.01192284 0.01185226 0.01186895
|
|
0.01191616 0.01193762 0.01183701 0.01184344]
|
|
|
|
mean value: 0.011641955375671387
|
|
|
|
key: test_mcc
|
|
value: [0.50395263 0.45184806 0.60848698 0.63628476 0.43033148 0.3146266
|
|
0.65419917 0.52679717 0.49193548 0.37005896]
|
|
|
|
mean value: 0.49885212995054273
|
|
|
|
key: train_mcc
|
|
value: [0.64492467 0.70137886 0.66009632 0.68270793 0.48108781 0.36731544
|
|
0.65453838 0.6739828 0.67888209 0.53301232]
|
|
|
|
mean value: 0.6077926631426009
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.71875 0.796875 0.8125 0.65625 0.609375
|
|
0.82539683 0.76190476 0.74603175 0.66666667]
|
|
|
|
mean value: 0.734375
|
|
|
|
key: train_accuracy
|
|
value: [0.81293706 0.84965035 0.82167832 0.83566434 0.69405594 0.61888112
|
|
0.82722513 0.83420593 0.83944154 0.73472949]
|
|
|
|
mean value: 0.786846922710797
|
|
|
|
key: test_fscore
|
|
value: [0.73333333 0.67857143 0.77192982 0.79310345 0.74418605 0.71264368
|
|
0.83076923 0.76923077 0.75 0.58823529]
|
|
|
|
mean value: 0.7372003053532222
|
|
|
|
key: train_fscore
|
|
value: [0.78727634 0.84363636 0.7992126 0.81923077 0.76383266 0.72405063
|
|
0.82901554 0.84451718 0.83916084 0.65137615]
|
|
|
|
mean value: 0.7901309079655531
|
|
|
|
key: test_precision
|
|
value: [0.78571429 0.79166667 0.88 0.88461538 0.59259259 0.56363636
|
|
0.79411765 0.73529412 0.75 0.78947368]
|
|
|
|
mean value: 0.7567110742141702
|
|
|
|
key: train_precision
|
|
value: [0.9124424 0.87878788 0.91441441 0.91025641 0.62197802 0.56746032
|
|
0.82191781 0.7962963 0.83916084 0.94666667]
|
|
|
|
mean value: 0.8209381049553387
|
|
|
|
key: test_recall
|
|
value: [0.6875 0.59375 0.6875 0.71875 1. 0.96875
|
|
0.87096774 0.80645161 0.75 0.46875 ]
|
|
|
|
mean value: 0.755241935483871
|
|
|
|
key: train_recall
|
|
value: [0.69230769 0.81118881 0.70979021 0.74475524 0.98951049 1.
|
|
0.83623693 0.8989547 0.83916084 0.4965035 ]
|
|
|
|
mean value: 0.8018408420847445
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.71875 0.796875 0.8125 0.65625 0.609375
|
|
0.82610887 0.76260081 0.74596774 0.66985887]
|
|
|
|
mean value: 0.7348286290322581
|
|
|
|
key: train_roc_auc
|
|
value: [0.81293706 0.84965035 0.82167832 0.83566434 0.69405594 0.61888112
|
|
0.82720938 0.83409274 0.83944105 0.73431447]
|
|
|
|
mean value: 0.7867924758168661
|
|
|
|
key: test_jcc
|
|
value: [0.57894737 0.51351351 0.62857143 0.65714286 0.59259259 0.55357143
|
|
0.71052632 0.625 0.6 0.41666667]
|
|
|
|
mean value: 0.5876532171269013
|
|
|
|
key: train_jcc
|
|
value: [0.64918033 0.72955975 0.66557377 0.69381107 0.61790393 0.56746032
|
|
0.7079646 0.73087819 0.72289157 0.4829932 ]
|
|
|
|
mean value: 0.656821672158094
|
|
|
|
MCC on Blind test: 0.65
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03064966 0.02935052 0.02632499 0.02779365 0.02897787 0.03325915
|
|
0.0250783 0.03014135 0.02610087 0.0317347 ]
|
|
|
|
mean value: 0.02894110679626465
|
|
|
|
key: score_time
|
|
value: [0.01195979 0.01186347 0.01194692 0.0119195 0.01188731 0.01189685
|
|
0.01187181 0.01184964 0.0119009 0.01189089]
|
|
|
|
mean value: 0.01189870834350586
|
|
|
|
key: test_mcc
|
|
value: [0.45990694 0.32897585 0.65657067 0.52915026 0.40451992 0.4330127
|
|
0.57759945 0.53874599 0.4969666 0.55909213]
|
|
|
|
mean value: 0.49845405061750964
|
|
|
|
key: train_mcc
|
|
value: [0.67615992 0.66717709 0.71805284 0.57509353 0.54781734 0.53942373
|
|
0.62689492 0.67079344 0.68718637 0.7400109 ]
|
|
|
|
mean value: 0.6448610096309454
|
|
|
|
key: test_accuracy
|
|
value: [0.703125 0.65625 0.828125 0.71875 0.6875 0.6875
|
|
0.76190476 0.76190476 0.74603175 0.77777778]
|
|
|
|
mean value: 0.7328869047619048
|
|
|
|
key: train_accuracy
|
|
value: [0.82167832 0.81993007 0.85839161 0.7534965 0.73776224 0.72727273
|
|
0.79581152 0.81849913 0.84293194 0.86736475]
|
|
|
|
mean value: 0.8043138798374401
|
|
|
|
key: test_fscore
|
|
value: [0.75949367 0.7027027 0.83076923 0.7804878 0.61538462 0.75
|
|
0.8 0.7826087 0.73333333 0.79411765]
|
|
|
|
mean value: 0.7548897700665005
|
|
|
|
key: train_fscore
|
|
value: [0.84545455 0.84226646 0.86247878 0.80056577 0.65116279 0.78512397
|
|
0.82511211 0.84337349 0.83754513 0.87458746]
|
|
|
|
mean value: 0.8167670500726047
|
|
|
|
key: test_precision
|
|
value: [0.63829787 0.61904762 0.81818182 0.64 0.8 0.625
|
|
0.68181818 0.71052632 0.78571429 0.75 ]
|
|
|
|
mean value: 0.7068586092891804
|
|
|
|
key: train_precision
|
|
value: [0.7459893 0.7493188 0.83828383 0.67220903 0.97222222 0.64772727
|
|
0.72251309 0.74270557 0.86567164 0.828125 ]
|
|
|
|
mean value: 0.778476575645141
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.8125 0.84375 1. 0.5 0.9375
|
|
0.96774194 0.87096774 0.6875 0.84375 ]
|
|
|
|
mean value: 0.8401209677419355
|
|
|
|
key: train_recall
|
|
value: [0.97552448 0.96153846 0.88811189 0.98951049 0.48951049 0.9965035
|
|
0.96167247 0.97560976 0.81118881 0.92657343]
|
|
|
|
mean value: 0.8975743768426695
|
|
|
|
key: test_roc_auc
|
|
value: [0.703125 0.65625 0.828125 0.71875 0.6875 0.6875
|
|
0.76512097 0.76360887 0.74697581 0.77671371]
|
|
|
|
mean value: 0.733366935483871
|
|
|
|
key: train_roc_auc
|
|
value: [0.82167832 0.81993007 0.85839161 0.7534965 0.73776224 0.72727273
|
|
0.79552155 0.81822446 0.84287664 0.8674679 ]
|
|
|
|
mean value: 0.8042622012134207
|
|
|
|
key: test_jcc
|
|
value: [0.6122449 0.54166667 0.71052632 0.64 0.44444444 0.6
|
|
0.66666667 0.64285714 0.57894737 0.65853659]
|
|
|
|
mean value: 0.6095890088170485
|
|
|
|
key: train_jcc
|
|
value: [0.73228346 0.72751323 0.75820896 0.66745283 0.48275862 0.6462585
|
|
0.70229008 0.72916667 0.72049689 0.7771261 ]
|
|
|
|
mean value: 0.6943555338702959
|
|
|
|
MCC on Blind test: 0.51
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.21230507 0.20174456 0.2023685 0.20934129 0.20010328 0.20097804
|
|
0.20069718 0.20317173 0.20407915 0.20166779]
|
|
|
|
mean value: 0.203645658493042
|
|
|
|
key: score_time
|
|
value: [0.01540875 0.01660895 0.01600695 0.01670003 0.01540923 0.01591396
|
|
0.01552558 0.01626444 0.01585245 0.01715422]
|
|
|
|
mean value: 0.016084456443786622
|
|
|
|
key: test_mcc
|
|
value: [0.72192954 0.68884672 0.72192954 0.84416229 0.65915306 0.59404013
|
|
0.71471774 0.77822581 0.8415746 0.77800241]
|
|
|
|
mean value: 0.7342581846545592
|
|
|
|
key: train_mcc
|
|
value: [0.90218614 0.90604313 0.90911314 0.87489633 0.89519245 0.93027467
|
|
0.89904215 0.90959958 0.91301596 0.891802 ]
|
|
|
|
mean value: 0.9031165549141992
|
|
|
|
key: test_accuracy
|
|
value: [0.859375 0.84375 0.859375 0.921875 0.828125 0.796875
|
|
0.85714286 0.88888889 0.92063492 0.88888889]
|
|
|
|
mean value: 0.8664930555555556
|
|
|
|
key: train_accuracy
|
|
value: [0.95104895 0.9527972 0.95454545 0.93706294 0.94755245 0.96503497
|
|
0.94938918 0.95462478 0.95636998 0.94589878]
|
|
|
|
mean value: 0.9514324680555047
|
|
|
|
key: test_fscore
|
|
value: [0.86567164 0.84848485 0.85245902 0.92307692 0.8358209 0.8
|
|
0.85714286 0.88888889 0.92307692 0.89230769]
|
|
|
|
mean value: 0.8686929686685008
|
|
|
|
key: train_fscore
|
|
value: [0.95138889 0.9535284 0.95438596 0.93835616 0.94791667 0.96539792
|
|
0.95008606 0.95532646 0.95682211 0.94570928]
|
|
|
|
mean value: 0.9518917916081902
|
|
|
|
key: test_precision
|
|
value: [0.82857143 0.82352941 0.89655172 0.90909091 0.8 0.78787879
|
|
0.84375 0.875 0.90909091 0.87878788]
|
|
|
|
mean value: 0.855225104932255
|
|
|
|
key: train_precision
|
|
value: [0.94482759 0.93898305 0.95774648 0.91946309 0.94137931 0.95547945
|
|
0.93877551 0.94237288 0.94539249 0.94736842]
|
|
|
|
mean value: 0.943178826965576
|
|
|
|
key: test_recall
|
|
value: [0.90625 0.875 0.8125 0.9375 0.875 0.8125
|
|
0.87096774 0.90322581 0.9375 0.90625 ]
|
|
|
|
mean value: 0.8836693548387097
|
|
|
|
key: train_recall
|
|
value: [0.95804196 0.96853147 0.95104895 0.95804196 0.95454545 0.97552448
|
|
0.96167247 0.96864111 0.96853147 0.94405594]
|
|
|
|
mean value: 0.9608635267171852
|
|
|
|
key: test_roc_auc
|
|
value: [0.859375 0.84375 0.859375 0.921875 0.828125 0.796875
|
|
0.85735887 0.8891129 0.9203629 0.88860887]
|
|
|
|
mean value: 0.8664818548387097
|
|
|
|
key: train_roc_auc
|
|
value: [0.95104895 0.9527972 0.95454545 0.93706294 0.94755245 0.96503497
|
|
0.94936771 0.95460028 0.95639117 0.94589557]
|
|
|
|
mean value: 0.9514296678930825
|
|
|
|
key: test_jcc
|
|
value: [0.76315789 0.73684211 0.74285714 0.85714286 0.71794872 0.66666667
|
|
0.75 0.8 0.85714286 0.80555556]
|
|
|
|
mean value: 0.7697313797313797
|
|
|
|
key: train_jcc
|
|
value: [0.90728477 0.91118421 0.91275168 0.88387097 0.9009901 0.93311037
|
|
0.90491803 0.91447368 0.91721854 0.89700997]
|
|
|
|
mean value: 0.9082812318056577
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0958612 0.12185383 0.12039638 0.10832477 0.08756161 0.10275364
|
|
0.10701203 0.08953285 0.11769557 0.09121203]
|
|
|
|
mean value: 0.10422039031982422
|
|
|
|
key: score_time
|
|
value: [0.02607036 0.03825665 0.0407958 0.01853371 0.02605748 0.03799462
|
|
0.01898909 0.02665234 0.0244019 0.02416778]
|
|
|
|
mean value: 0.028191971778869628
|
|
|
|
key: test_mcc
|
|
value: [0.875 0.71910121 0.8125 0.84748251 0.62622429 0.75146915
|
|
0.74596774 0.68352185 0.74722285 0.74596774]
|
|
|
|
mean value: 0.7554457343130832
|
|
|
|
key: train_mcc
|
|
value: [0.9688217 0.98257154 0.98951654 0.97911675 0.98266766 0.9860381
|
|
0.97905753 0.97229095 0.98255382 0.97908113]
|
|
|
|
mean value: 0.9801715709042963
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.859375 0.90625 0.921875 0.8125 0.875
|
|
0.87301587 0.84126984 0.87301587 0.87301587]
|
|
|
|
mean value: 0.877281746031746
|
|
|
|
key: train_accuracy
|
|
value: [0.98426573 0.99125874 0.99475524 0.98951049 0.99125874 0.99300699
|
|
0.9895288 0.98603839 0.991274 0.9895288 ]
|
|
|
|
mean value: 0.9900425926603937
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
key: test_fscore
|
|
value: [0.9375 0.86153846 0.90625 0.91803279 0.81818182 0.87096774
|
|
0.87096774 0.83333333 0.87878788 0.875 ]
|
|
|
|
mean value: 0.8770559762597705
|
|
|
|
key: train_fscore
|
|
value: [0.9840708 0.99121265 0.99474606 0.98943662 0.99118166 0.99298246
|
|
0.98954704 0.98591549 0.99124343 0.98947368]
|
|
|
|
mean value: 0.9899809891560609
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.84848485 0.90625 0.96551724 0.79411765 0.9
|
|
0.87096774 0.86206897 0.85294118 0.875 ]
|
|
|
|
mean value: 0.8812847620846296
|
|
|
|
key: train_precision
|
|
value: [0.99641577 0.99646643 0.99649123 0.9964539 1. 0.99647887
|
|
0.98954704 0.99644128 0.99298246 0.99295775]
|
|
|
|
mean value: 0.9954234725809098
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.875 0.90625 0.875 0.84375 0.84375
|
|
0.87096774 0.80645161 0.90625 0.875 ]
|
|
|
|
mean value: 0.873991935483871
|
|
|
|
key: train_recall
|
|
value: [0.97202797 0.98601399 0.99300699 0.98251748 0.98251748 0.98951049
|
|
0.98954704 0.97560976 0.98951049 0.98601399]
|
|
|
|
mean value: 0.9846275675543968
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.859375 0.90625 0.921875 0.8125 0.875
|
|
0.87298387 0.84072581 0.87247984 0.87298387]
|
|
|
|
mean value: 0.8771673387096774
|
|
|
|
key: train_roc_auc
|
|
value: [0.98426573 0.99125874 0.99475524 0.98951049 0.99125874 0.99300699
|
|
0.98952876 0.98605663 0.99127092 0.98952267]
|
|
|
|
mean value: 0.9900434930922736
|
|
|
|
key: test_jcc
|
|
value: [0.88235294 0.75675676 0.82857143 0.84848485 0.69230769 0.77142857
|
|
0.77142857 0.71428571 0.78378378 0.77777778]
|
|
|
|
mean value: 0.7827178086001616
|
|
|
|
key: train_jcc
|
|
value: [0.96864111 0.9825784 0.98954704 0.97909408 0.98251748 0.98606272
|
|
0.97931034 0.97222222 0.98263889 0.97916667]
|
|
|
|
mean value: 0.9801778950070581
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14886832 0.21103811 0.20516992 0.16590166 0.21451283 0.21358299
|
|
0.2202301 0.22097421 0.24722981 0.2264185 ]
|
|
|
|
mean value: 0.20739264488220216
|
|
|
|
key: score_time
|
|
value: [0.01676941 0.02704263 0.02702403 0.01622701 0.03246331 0.03205514
|
|
0.03158545 0.03284216 0.0324614 0.03306985]
|
|
|
|
mean value: 0.02815403938293457
|
|
|
|
key: test_mcc
|
|
value: [0.37573457 0.3125 0.4113018 0.34527065 0.4163332 0.40644851
|
|
0.53874599 0.5026181 0.23790323 0.46014151]
|
|
|
|
mean value: 0.4006997549871454
|
|
|
|
key: train_mcc
|
|
value: [0.93788887 0.93425771 0.93485309 0.93425771 0.93152879 0.92389053
|
|
0.93163919 0.93753513 0.93437548 0.9447324 ]
|
|
|
|
mean value: 0.9344958888426667
|
|
|
|
key: test_accuracy
|
|
value: [0.6875 0.65625 0.703125 0.671875 0.703125 0.703125
|
|
0.76190476 0.74603175 0.61904762 0.73015873]
|
|
|
|
mean value: 0.6982142857142857
|
|
|
|
key: train_accuracy
|
|
value: [0.96853147 0.96678322 0.96678322 0.96678322 0.96503497 0.96153846
|
|
0.96509599 0.96858639 0.96684119 0.97207679]
|
|
|
|
mean value: 0.9668054894494685
|
|
|
|
key: test_fscore
|
|
value: [0.67741935 0.65625 0.6779661 0.68656716 0.73239437 0.70769231
|
|
0.7826087 0.76470588 0.625 0.73846154]
|
|
|
|
mean value: 0.7049065411068873
|
|
|
|
key: train_fscore
|
|
value: [0.96917808 0.96740995 0.96763203 0.96740995 0.96598639 0.96232877
|
|
0.96610169 0.96907216 0.96740995 0.97250859]
|
|
|
|
mean value: 0.9675037567685204
|
|
|
|
key: test_precision
|
|
value: [0.7 0.65625 0.74074074 0.65714286 0.66666667 0.6969697
|
|
0.71052632 0.7027027 0.625 0.72727273]
|
|
|
|
mean value: 0.6883271707284865
|
|
|
|
key: train_precision
|
|
value: [0.94966443 0.94949495 0.94352159 0.94949495 0.94039735 0.94295302
|
|
0.94059406 0.9559322 0.94949495 0.95608108]
|
|
|
|
mean value: 0.9477628587703893
|
|
|
|
key: test_recall
|
|
value: [0.65625 0.65625 0.625 0.71875 0.8125 0.71875
|
|
0.87096774 0.83870968 0.625 0.75 ]
|
|
|
|
mean value: 0.7272177419354838
|
|
|
|
key: train_recall
|
|
value: [0.98951049 0.98601399 0.99300699 0.98601399 0.99300699 0.98251748
|
|
0.99303136 0.9825784 0.98601399 0.98951049]
|
|
|
|
mean value: 0.9881204161691967
|
|
|
|
key: test_roc_auc
|
|
value: [0.6875 0.65625 0.703125 0.671875 0.703125 0.703125
|
|
0.76360887 0.74747984 0.61895161 0.72983871]
|
|
|
|
mean value: 0.6984879032258065
|
|
|
|
key: train_roc_auc
|
|
value: [0.96853147 0.96678322 0.96678322 0.96678322 0.96503497 0.96153846
|
|
0.96504715 0.96856193 0.96687459 0.97210716]
|
|
|
|
mean value: 0.9668045369264882
|
|
|
|
key: test_jcc
|
|
value: [0.51219512 0.48837209 0.51282051 0.52272727 0.57777778 0.54761905
|
|
0.64285714 0.61904762 0.45454545 0.58536585]
|
|
|
|
mean value: 0.546332789602784
|
|
|
|
key: train_jcc
|
|
value: [0.94019934 0.93687708 0.93729373 0.93687708 0.93421053 0.92739274
|
|
0.93442623 0.94 0.93687708 0.94648829]
|
|
|
|
mean value: 0.9370642083569285
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.86300206 0.8570118 0.84512067 0.83157301 0.83923888 0.83239603
|
|
0.83465123 0.83852315 0.84167743 0.8304162 ]
|
|
|
|
mean value: 0.8413610458374023
|
|
|
|
key: score_time
|
|
value: [0.00991249 0.01026297 0.00953531 0.00981331 0.0096705 0.00957942
|
|
0.00936604 0.01012897 0.00972867 0.00952649]
|
|
|
|
mean value: 0.009752416610717773
|
|
|
|
key: test_mcc
|
|
value: [0.84416229 0.78163175 0.8125 0.81409158 0.81409158 0.75
|
|
0.77822581 0.84173387 0.8415746 0.74722285]
|
|
|
|
mean value: 0.802523432013684
|
|
|
|
key: train_mcc
|
|
value: [0.95806538 0.97553044 0.9688217 0.97557815 0.96853739 0.98601399
|
|
0.96859238 0.96863997 0.96511897 0.96530633]
|
|
|
|
mean value: 0.9700204697320085
|
|
|
|
key: test_accuracy
|
|
value: [0.921875 0.890625 0.90625 0.90625 0.90625 0.875
|
|
0.88888889 0.92063492 0.92063492 0.87301587]
|
|
|
|
mean value: 0.9009424603174603
|
|
|
|
key: train_accuracy
|
|
value: [0.97902098 0.98776224 0.98426573 0.98776224 0.98426573 0.99300699
|
|
0.98429319 0.98429319 0.98254799 0.98254799]
|
|
|
|
mean value: 0.9849766289556866
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.89230769 0.90625 0.90322581 0.90909091 0.875
|
|
0.88888889 0.92063492 0.92307692 0.87878788]
|
|
|
|
mean value: 0.9020339942315748
|
|
|
|
key: train_fscore
|
|
value: [0.97894737 0.9877836 0.9840708 0.98769772 0.98423818 0.99300699
|
|
0.98429319 0.98423818 0.98245614 0.98233216]
|
|
|
|
mean value: 0.9849064315104781
|
|
|
|
key: test_precision
|
|
value: [0.90909091 0.87878788 0.90625 0.93333333 0.88235294 0.875
|
|
0.875 0.90625 0.90909091 0.85294118]
|
|
|
|
mean value: 0.8928097147950089
|
|
|
|
key: train_precision
|
|
value: [0.98239437 0.98606272 0.99641577 0.99293286 0.98596491 0.99300699
|
|
0.98601399 0.98943662 0.98591549 0.99285714]
|
|
|
|
mean value: 0.989100086360223
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.90625 0.90625 0.875 0.9375 0.875
|
|
0.90322581 0.93548387 0.9375 0.90625 ]
|
|
|
|
mean value: 0.9119959677419355
|
|
|
|
key: train_recall
|
|
value: [0.97552448 0.98951049 0.97202797 0.98251748 0.98251748 0.99300699
|
|
0.9825784 0.97909408 0.97902098 0.97202797]
|
|
|
|
mean value: 0.9807826320021442
|
|
|
|
key: test_roc_auc
|
|
value: [0.921875 0.890625 0.90625 0.90625 0.90625 0.875
|
|
0.8891129 0.92086694 0.9203629 0.87247984]
|
|
|
|
mean value: 0.9009072580645161
|
|
|
|
key: train_roc_auc
|
|
value: [0.97902098 0.98776224 0.98426573 0.98776224 0.98426573 0.99300699
|
|
0.98429619 0.98430228 0.98254185 0.98252967]
|
|
|
|
mean value: 0.9849753904631954
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.80555556 0.82857143 0.82352941 0.83333333 0.77777778
|
|
0.8 0.85294118 0.85714286 0.78378378]
|
|
|
|
mean value: 0.8219778181542887
|
|
|
|
key: train_jcc
|
|
value: [0.95876289 0.97586207 0.96864111 0.97569444 0.96896552 0.98611111
|
|
0.96907216 0.96896552 0.96551724 0.96527778]
|
|
|
|
mean value: 0.970286984468989
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03439713 0.03367257 0.0339005 0.03395748 0.03364205 0.03365755
|
|
0.03340864 0.04170251 0.04242253 0.05480218]
|
|
|
|
mean value: 0.03755631446838379
|
|
|
|
key: score_time
|
|
value: [0.01224303 0.01276779 0.01289511 0.01367021 0.0135715 0.01362062
|
|
0.01876259 0.01285124 0.01362205 0.02431679]
|
|
|
|
mean value: 0.014832091331481934
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0.05006262 0.18898224 0.17466675 -0.12598816 0.05006262
|
|
0.12961896 0.04490133 -0.12607181 0.06339049]
|
|
|
|
mean value: 0.044962502731329346
|
|
|
|
key: train_mcc
|
|
value: [0.27050089 0.23526698 0.26676028 0.23937689 0.26298076 0.25139019
|
|
0.23129033 0.24763842 0.27712658 0.24677557]
|
|
|
|
mean value: 0.2529106884487375
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.515625 0.5625 0.546875 0.484375 0.515625
|
|
0.53968254 0.50793651 0.47619048 0.52380952]
|
|
|
|
mean value: 0.5172619047619047
|
|
|
|
key: train_accuracy
|
|
value: [0.56818182 0.55244755 0.56643357 0.5541958 0.56468531 0.55944056
|
|
0.55148342 0.55846422 0.57068063 0.55671902]
|
|
|
|
mean value: 0.5602731910323533
|
|
|
|
key: test_fscore
|
|
value: [0.61904762 0.65168539 0.68181818 0.68131868 0.65263158 0.65168539
|
|
0.65882353 0.64367816 0.63736264 0.66666667]
|
|
|
|
mean value: 0.6544717842009313
|
|
|
|
key: train_fscore
|
|
value: [0.6984127 0.69082126 0.69756098 0.69165659 0.69671133 0.69417476
|
|
0.69073406 0.69407497 0.6992665 0.69249395]
|
|
|
|
mean value: 0.6945907080600471
|
|
|
|
key: test_precision
|
|
value: [0.5 0.50877193 0.53571429 0.52542373 0.49206349 0.50877193
|
|
0.51851852 0.5 0.49152542 0.51724138]
|
|
|
|
mean value: 0.5098030687798137
|
|
|
|
key: train_precision
|
|
value: [0.53658537 0.52767528 0.53558052 0.52865065 0.53457944 0.53159851
|
|
0.52757353 0.53148148 0.53759398 0.52962963]
|
|
|
|
mean value: 0.5320948391649859
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.90625 0.9375 0.96875 0.96875 0.90625
|
|
0.90322581 0.90322581 0.90625 0.9375 ]
|
|
|
|
mean value: 0.9150201612903226
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.515625 0.5625 0.546875 0.484375 0.515625
|
|
0.5453629 0.5141129 0.46925403 0.5171371 ]
|
|
|
|
mean value: 0.5170866935483871
|
|
|
|
key: train_roc_auc
|
|
value: [0.56818182 0.55244755 0.56643357 0.5541958 0.56468531 0.55944056
|
|
0.5506993 0.55769231 0.57142857 0.55749129]
|
|
|
|
mean value: 0.5602696084403401
|
|
|
|
key: test_jcc
|
|
value: [0.44827586 0.48333333 0.51724138 0.51666667 0.484375 0.48333333
|
|
0.49122807 0.47457627 0.46774194 0.5 ]
|
|
|
|
mean value: 0.4866771851558394
|
|
|
|
key: train_jcc
|
|
value: [0.53658537 0.52767528 0.53558052 0.52865065 0.53457944 0.53159851
|
|
0.52757353 0.53148148 0.53759398 0.52962963]
|
|
|
|
mean value: 0.5320948391649859
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03693748 0.03959489 0.03556442 0.02844191 0.04038453 0.0297482
|
|
0.04064775 0.04152107 0.0279665 0.03697538]
|
|
|
|
mean value: 0.03577821254730225
|
|
|
|
key: score_time
|
|
value: [0.01513863 0.02490067 0.02957106 0.0205822 0.02573943 0.01919627
|
|
0.01912403 0.01920748 0.0150106 0.01629949]
|
|
|
|
mean value: 0.020476984977722167
|
|
|
|
key: test_mcc
|
|
value: [0.42333825 0.53150959 0.65657067 0.69991324 0.60848698 0.62622429
|
|
0.63159952 0.5892604 0.46014151 0.52371369]
|
|
|
|
mean value: 0.5750758136813815
|
|
|
|
key: train_mcc
|
|
value: [0.73686479 0.754191 0.73078337 0.72905754 0.73833893 0.74434091
|
|
0.7571405 0.74253082 0.72556417 0.75465049]
|
|
|
|
mean value: 0.7413462525930328
|
|
|
|
key: test_accuracy
|
|
value: [0.703125 0.765625 0.828125 0.84375 0.796875 0.8125
|
|
0.80952381 0.79365079 0.73015873 0.76190476]
|
|
|
|
mean value: 0.7845238095238095
|
|
|
|
key: train_accuracy
|
|
value: [0.86713287 0.87587413 0.86363636 0.86363636 0.86713287 0.87062937
|
|
0.87783595 0.86910995 0.86212914 0.87609075]
|
|
|
|
mean value: 0.8693207752108275
|
|
|
|
key: test_fscore
|
|
value: [0.73972603 0.76190476 0.83076923 0.85714286 0.81690141 0.81818182
|
|
0.82352941 0.77966102 0.73846154 0.76923077]
|
|
|
|
mean value: 0.7935508840252798
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:176: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:179: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.87248322 0.88067227 0.87 0.86824324 0.87375415 0.87625418
|
|
0.88175676 0.87603306 0.86587436 0.88067227]
|
|
|
|
mean value: 0.8745743513896477
|
|
|
|
key: test_precision
|
|
value: [0.65853659 0.77419355 0.81818182 0.78947368 0.74358974 0.79411765
|
|
0.75675676 0.82142857 0.72727273 0.75757576]
|
|
|
|
mean value: 0.7641126839827675
|
|
|
|
key: train_precision
|
|
value: [0.83870968 0.84789644 0.83121019 0.83986928 0.83227848 0.83974359
|
|
0.8557377 0.83333333 0.84158416 0.84789644]
|
|
|
|
mean value: 0.8408259297230265
|
|
|
|
key: test_recall
|
|
value: [0.84375 0.75 0.84375 0.9375 0.90625 0.84375
|
|
0.90322581 0.74193548 0.75 0.78125 ]
|
|
|
|
mean value: 0.830141129032258
|
|
|
|
key: train_recall
|
|
value: [0.90909091 0.91608392 0.91258741 0.8986014 0.91958042 0.91608392
|
|
0.90940767 0.92334495 0.89160839 0.91608392]
|
|
|
|
mean value: 0.9112472892960698
|
|
|
|
key: test_roc_auc
|
|
value: [0.703125 0.765625 0.828125 0.84375 0.796875 0.8125
|
|
0.8109879 0.79284274 0.72983871 0.76159274]
|
|
|
|
mean value: 0.7845262096774194
|
|
|
|
key: train_roc_auc
|
|
value: [0.86713287 0.87587413 0.86363636 0.86363636 0.86713287 0.87062937
|
|
0.87778076 0.86901513 0.8621805 0.87616042]
|
|
|
|
mean value: 0.8693178772447065
|
|
|
|
key: test_jcc
|
|
value: [0.58695652 0.61538462 0.71052632 0.75 0.69047619 0.69230769
|
|
0.7 0.63888889 0.58536585 0.625 ]
|
|
|
|
mean value: 0.6594906078244528
|
|
|
|
key: train_jcc
|
|
value: [0.77380952 0.78678679 0.7699115 0.76716418 0.77581121 0.7797619
|
|
0.78851964 0.77941176 0.76347305 0.78678679]
|
|
|
|
mean value: 0.777143635117412
|
|
|
|
MCC on Blind test: 0.51
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.26221371 0.30311131 0.30173469 0.37015843 0.35645747 0.3390913
|
|
0.31258631 0.33007884 0.3021841 0.35456944]
|
|
|
|
mean value: 0.323218560218811
|
|
|
|
key: score_time
|
|
value: [0.0190413 0.0189321 0.0212183 0.02102017 0.02358246 0.01976538
|
|
0.01957774 0.04501891 0.02539778 0.02190781]
|
|
|
|
mean value: 0.0235461950302124
|
|
|
|
key: test_mcc
|
|
value: [0.42333825 0.53150959 0.65657067 0.69991324 0.60848698 0.62622429
|
|
0.63159952 0.5892604 0.46014151 0.52371369]
|
|
|
|
mean value: 0.5750758136813815
|
|
|
|
key: train_mcc
|
|
value: [0.73686479 0.754191 0.73078337 0.72905754 0.73833893 0.74434091
|
|
0.7571405 0.74253082 0.72556417 0.75465049]
|
|
|
|
mean value: 0.7413462525930328
|
|
|
|
key: test_accuracy
|
|
value: [0.703125 0.765625 0.828125 0.84375 0.796875 0.8125
|
|
0.80952381 0.79365079 0.73015873 0.76190476]
|
|
|
|
mean value: 0.7845238095238095
|
|
|
|
key: train_accuracy
|
|
value: [0.86713287 0.87587413 0.86363636 0.86363636 0.86713287 0.87062937
|
|
0.87783595 0.86910995 0.86212914 0.87609075]
|
|
|
|
mean value: 0.8693207752108275
|
|
|
|
key: test_fscore
|
|
value: [0.73972603 0.76190476 0.83076923 0.85714286 0.81690141 0.81818182
|
|
0.82352941 0.77966102 0.73846154 0.76923077]
|
|
|
|
mean value: 0.7935508840252798
|
|
|
|
key: train_fscore
|
|
value: [0.87248322 0.88067227 0.87 0.86824324 0.87375415 0.87625418
|
|
0.88175676 0.87603306 0.86587436 0.88067227]
|
|
|
|
mean value: 0.8745743513896477
|
|
|
|
key: test_precision
|
|
value: [0.65853659 0.77419355 0.81818182 0.78947368 0.74358974 0.79411765
|
|
0.75675676 0.82142857 0.72727273 0.75757576]
|
|
|
|
mean value: 0.7641126839827675
|
|
|
|
key: train_precision
|
|
value: [0.83870968 0.84789644 0.83121019 0.83986928 0.83227848 0.83974359
|
|
0.8557377 0.83333333 0.84158416 0.84789644]
|
|
|
|
mean value: 0.8408259297230265
|
|
|
|
key: test_recall
|
|
value: [0.84375 0.75 0.84375 0.9375 0.90625 0.84375
|
|
0.90322581 0.74193548 0.75 0.78125 ]
|
|
|
|
mean value: 0.830141129032258
|
|
|
|
key: train_recall
|
|
value: [0.90909091 0.91608392 0.91258741 0.8986014 0.91958042 0.91608392
|
|
0.90940767 0.92334495 0.89160839 0.91608392]
|
|
|
|
mean value: 0.9112472892960698
|
|
|
|
key: test_roc_auc
|
|
value: [0.703125 0.765625 0.828125 0.84375 0.796875 0.8125
|
|
0.8109879 0.79284274 0.72983871 0.76159274]
|
|
|
|
mean value: 0.7845262096774194
|
|
|
|
key: train_roc_auc
|
|
value: [0.86713287 0.87587413 0.86363636 0.86363636 0.86713287 0.87062937
|
|
0.87778076 0.86901513 0.8621805 0.87616042]
|
|
|
|
mean value: 0.8693178772447065
|
|
|
|
key: test_jcc
|
|
value: [0.58695652 0.61538462 0.71052632 0.75 0.69047619 0.69230769
|
|
0.7 0.63888889 0.58536585 0.625 ]
|
|
|
|
mean value: 0.6594906078244528
|
|
|
|
key: train_jcc
|
|
value: [0.77380952 0.78678679 0.7699115 0.76716418 0.77581121 0.7797619
|
|
0.78851964 0.77941176 0.76347305 0.78678679]
|
|
|
|
mean value: 0.777143635117412
|
|
|
|
MCC on Blind test: 0.51
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04019094 0.0413723 0.04116201 0.04376554 0.04884338 0.04856467
|
|
0.04278922 0.04961562 0.04909396 0.04108286]
|
|
|
|
mean value: 0.044648051261901855
|
|
|
|
key: score_time
|
|
value: [0.01230788 0.01224995 0.01228285 0.0129385 0.01298714 0.01302695
|
|
0.02017069 0.01323438 0.01313925 0.02139902]
|
|
|
|
mean value: 0.01437366008758545
|
|
|
|
key: test_mcc
|
|
value: [0.6882472 0.4459799 0.50051733 0.56950711 0.63213531 0.54098368
|
|
0.61371748 0.61028941 0.58615222 0.52312769]
|
|
|
|
mean value: 0.571065733859239
|
|
|
|
key: train_mcc
|
|
value: [0.69132428 0.68622226 0.68555338 0.67500864 0.65578747 0.65626006
|
|
0.66354171 0.67035181 0.65177608 0.69416521]
|
|
|
|
mean value: 0.672999088350579
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.71590909 0.75 0.78409091 0.81609195 0.77011494
|
|
0.8045977 0.8045977 0.79310345 0.75862069]
|
|
|
|
mean value: 0.78380355276907
|
|
|
|
key: train_accuracy
|
|
value: [0.84478372 0.84223919 0.84223919 0.83715013 0.82719187 0.82719187
|
|
0.83100381 0.83481576 0.82465057 0.84625159]
|
|
|
|
mean value: 0.8357517677526989
|
|
|
|
key: test_fscore
|
|
value: [0.85106383 0.74747475 0.74418605 0.79120879 0.81395349 0.77272727
|
|
0.81318681 0.81318681 0.79545455 0.77894737]
|
|
|
|
mean value: 0.7921389716330991
|
|
|
|
key: train_fscore
|
|
value: [0.85012285 0.84766585 0.84653465 0.84079602 0.83292383 0.83374083
|
|
0.83680982 0.83830846 0.83170732 0.85116851]
|
|
|
|
mean value: 0.8409778137794869
|
|
|
|
key: test_precision
|
|
value: [0.8 0.67272727 0.76190476 0.76595745 0.81395349 0.75555556
|
|
0.77083333 0.78723404 0.79545455 0.7254902 ]
|
|
|
|
mean value: 0.7649110642787695
|
|
|
|
key: train_precision
|
|
value: [0.82185273 0.81947743 0.82409639 0.82238443 0.80714286 0.80424528
|
|
0.80997625 0.81995134 0.79859485 0.82380952]
|
|
|
|
mean value: 0.8151531077013614
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.84090909 0.72727273 0.81818182 0.81395349 0.79069767
|
|
0.86046512 0.84090909 0.79545455 0.84090909]
|
|
|
|
mean value: 0.823784355179704
|
|
|
|
key: train_recall
|
|
value: [0.88040712 0.8778626 0.87022901 0.86005089 0.86040609 0.86548223
|
|
0.86548223 0.85750636 0.86768448 0.88040712]
|
|
|
|
mean value: 0.8685518141072835
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.71590909 0.75 0.78409091 0.81606765 0.77034884
|
|
0.80523256 0.80417548 0.79307611 0.75766385]
|
|
|
|
mean value: 0.7837473572938689
|
|
|
|
key: train_roc_auc
|
|
value: [0.84478372 0.84223919 0.84223919 0.83715013 0.82714961 0.82714315
|
|
0.83095995 0.83484455 0.82470518 0.84629493]
|
|
|
|
mean value: 0.8357509590421204
|
|
|
|
key: test_jcc
|
|
value: [0.74074074 0.59677419 0.59259259 0.65454545 0.68627451 0.62962963
|
|
0.68518519 0.68518519 0.66037736 0.63793103]
|
|
|
|
mean value: 0.6569235884204421
|
|
|
|
key: train_jcc
|
|
value: [0.73931624 0.73560768 0.73390558 0.72532189 0.71368421 0.7148847
|
|
0.71940928 0.72162741 0.71189979 0.74089936]
|
|
|
|
mean value: 0.7256556130104113
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [1.02642417 0.96141696 0.96252251 1.05650187 0.9084363 1.06816649
0.99326634 1.05088854 0.93737698 1.2540741 ]

mean value: 1.021907424926758

key: score_time
value: [0.01944494 0.0148797 0.01482654 0.01350117 0.01510096 0.01505184
0.01477361 0.01542497 0.01489115 0.01478815]

mean value: 0.015268301963806153

key: test_mcc
value: [0.6846532 0.51343603 0.52286233 0.68252363 0.65696218 0.70301836
0.78388673 0.70121639 0.67811839 0.56634733]

mean value: 0.6493024567830749

key: train_mcc
value: [0.77935545 0.79259146 0.80975034 0.80522165 0.787535 0.76785901
0.77980298 0.78559375 0.788271 0.81000038]

mean value: 0.7905981006091508

key: test_accuracy
value: [0.84090909 0.75 0.76136364 0.84090909 0.82758621 0.85057471
0.88505747 0.85057471 0.83908046 0.7816092 ]

mean value: 0.8227664576802508

key: train_accuracy
value: [0.88931298 0.8956743 0.90458015 0.90203562 0.89326557 0.88310038
0.88945362 0.89199492 0.89326557 0.9047014 ]

mean value: 0.894738450197387

key: test_fscore
value: [0.84782609 0.7755102 0.75862069 0.84444444 0.83146067 0.85393258
0.89361702 0.85393258 0.84090909 0.79569892]

mean value: 0.8295952304751271

key: train_fscore
value: [0.89165629 0.89851485 0.90636704 0.90458488 0.8960396 0.88697789
0.89219331 0.89519112 0.89655172 0.90636704]

mean value: 0.8974443750776682

key: test_precision
value: [0.8125 0.7037037 0.76744186 0.82608696 0.80434783 0.82608696
0.82352941 0.84444444 0.84090909 0.75510204]

mean value: 0.8004152291233823

key: train_precision
value: [0.87317073 0.8746988 0.88970588 0.88164251 0.87439614 0.85952381
0.8716707 0.86842105 0.86873508 0.88970588]

mean value: 0.8751670586803703

key: test_recall
value: [0.88636364 0.86363636 0.75 0.86363636 0.86046512 0.88372093
0.97674419 0.86363636 0.84090909 0.84090909]

mean value: 0.8630021141649049

key: train_recall
value: [0.91094148 0.92366412 0.92366412 0.92875318 0.91878173 0.91624365
0.91370558 0.92366412 0.92620865 0.92366412]

mean value: 0.9209290760904664

key: test_roc_auc
value: [0.84090909 0.75 0.76136364 0.84090909 0.82795983 0.85095137
0.88609937 0.85042283 0.8390592 0.78091966]

mean value: 0.8228594080338266

key: train_roc_auc
value: [0.88931298 0.8956743 0.90458015 0.90203562 0.8932331 0.88305821
0.88942277 0.89203511 0.89330737 0.90472546]

mean value: 0.894738507640046

key: test_jcc
value: [0.73584906 0.63333333 0.61111111 0.73076923 0.71153846 0.74509804
0.80769231 0.74509804 0.7254902 0.66071429]

mean value: 0.7106694061272307

key: train_jcc
value: [0.80449438 0.81573034 0.82876712 0.82579186 0.81165919 0.79690949
0.80536913 0.81026786 0.8125 0.82876712]

mean value: 0.8140256490638564

MCC on Blind test: 0.4

Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01627254 0.0121696 0.0112617 0.01123834 0.01204562 0.01295519
|
|
0.0131247 0.01328087 0.01170492 0.01157737]
|
|
|
|
mean value: 0.012563085556030274
|
|
|
|
key: score_time
|
|
value: [0.01252651 0.00963211 0.00918579 0.00921011 0.01000023 0.01023221
|
|
0.01029634 0.01028538 0.00933933 0.00943708]
|
|
|
|
mean value: 0.010014510154724121
|
|
|
|
key: test_mcc
|
|
value: [0.34869484 0.32357511 0.47739604 0.50847518 0.41659257 0.54016913
|
|
0.35695404 0.37916452 0.33315711 0.24154334]
|
|
|
|
mean value: 0.39257218826227025
|
|
|
|
key: train_mcc
|
|
value: [0.39046926 0.48153999 0.43592814 0.43606986 0.4002895 0.45727191
|
|
0.45958799 0.46528515 0.44934693 0.45720637]
|
|
|
|
mean value: 0.4432995086452203
|
|
|
|
key: test_accuracy
|
|
value: [0.65909091 0.65909091 0.73863636 0.75 0.70114943 0.77011494
|
|
0.67816092 0.68965517 0.66666667 0.62068966]
|
|
|
|
mean value: 0.6933254963427378
|
|
|
|
key: train_accuracy
|
|
value: [0.68575064 0.74045802 0.71755725 0.71755725 0.69758577 0.72808132
|
|
0.72935197 0.73189327 0.72426938 0.72808132]
|
|
|
|
mean value: 0.7200586179358598
|
|
|
|
key: test_fscore
|
|
value: [0.71698113 0.6875 0.74157303 0.77083333 0.64864865 0.76744186
|
|
0.68181818 0.69662921 0.6741573 0.62068966]
|
|
|
|
mean value: 0.7006272362074963
|
|
|
|
key: train_fscore
|
|
value: [0.72767365 0.74689826 0.72592593 0.72660099 0.67217631 0.7377451
|
|
0.73800738 0.74173807 0.73176761 0.7364532 ]
|
|
|
|
mean value: 0.7284986492626067
|
|
|
|
key: test_precision
|
|
value: [0.61290323 0.63461538 0.73333333 0.71153846 0.77419355 0.76744186
|
|
0.66666667 0.68888889 0.66666667 0.62790698]
|
|
|
|
mean value: 0.6884155013112252
|
|
|
|
key: train_precision
|
|
value: [0.64202335 0.72881356 0.70503597 0.70405728 0.73493976 0.71327014
|
|
0.71599045 0.71462264 0.71153846 0.71360382]
|
|
|
|
mean value: 0.7083895432425341
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.75 0.75 0.84090909 0.55813953 0.76744186
|
|
0.69767442 0.70454545 0.68181818 0.61363636]
|
|
|
|
mean value: 0.7227801268498943
|
|
|
|
key: train_recall
|
|
value: [0.83969466 0.76590331 0.7480916 0.75063613 0.61928934 0.76395939
|
|
0.76142132 0.77099237 0.75318066 0.76081425]
|
|
|
|
mean value: 0.7533983027860658
|
|
|
|
key: test_roc_auc
|
|
value: [0.65909091 0.65909091 0.73863636 0.75 0.69952431 0.77008457
|
|
0.67838266 0.68948203 0.66649049 0.62077167]
|
|
|
|
mean value: 0.6931553911205074
|
|
|
|
key: train_roc_auc
|
|
value: [0.68575064 0.74045802 0.71755725 0.71755725 0.69768538 0.72803568
|
|
0.72931117 0.73194288 0.72430607 0.72812286]
|
|
|
|
mean value: 0.7200727192880485
|
|
|
|
key: test_jcc
|
|
value: [0.55882353 0.52380952 0.58928571 0.62711864 0.48 0.62264151
|
|
0.51724138 0.53448276 0.50847458 0.45 ]
|
|
|
|
mean value: 0.5411877635210982
|
|
|
|
key: train_jcc
|
|
value: [0.57192374 0.5960396 0.56976744 0.57059961 0.50622407 0.58446602
|
|
0.58479532 0.58949416 0.57699805 0.582846 ]
|
|
|
|
mean value: 0.5733154027924497
|
|
|
|
MCC on Blind test: 0.54
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01238036 0.01613355 0.01603031 0.01599598 0.01595688 0.01601386
|
|
0.01603103 0.01612544 0.01610541 0.01639628]
|
|
|
|
mean value: 0.015716910362243652
|
|
|
|
key: score_time
|
|
value: [0.01213884 0.01235485 0.0123291 0.01238704 0.01234341 0.01234961
|
|
0.01233554 0.01234794 0.01232505 0.012326 ]
|
|
|
|
mean value: 0.012323737144470215
|
|
|
|
key: test_mcc
|
|
value: [0.43463356 0.21410373 0.43463356 0.48342972 0.51718675 0.4070455
|
|
0.40221987 0.47273749 0.37964137 0.26461585]
|
|
|
|
mean value: 0.4010247399358443
|
|
|
|
key: train_mcc
|
|
value: [0.47703926 0.49682698 0.44056884 0.47587923 0.46007909 0.46468129
|
|
0.45642617 0.47568224 0.45667135 0.49463369]
|
|
|
|
mean value: 0.4698488137783724
|
|
|
|
key: test_accuracy
|
|
value: [0.71590909 0.60227273 0.71590909 0.73863636 0.75862069 0.70114943
|
|
0.70114943 0.73563218 0.68965517 0.63218391]
|
|
|
|
mean value: 0.6991118077324974
|
|
|
|
key: train_accuracy
|
|
value: [0.73791349 0.7480916 0.72010178 0.73664122 0.72935197 0.73189327
|
|
0.72808132 0.73697586 0.72808132 0.74587039]
|
|
|
|
mean value: 0.7343002221209153
|
|
|
|
key: test_fscore
|
|
value: [0.7311828 0.65346535 0.69879518 0.75789474 0.75294118 0.7173913
|
|
0.69767442 0.72941176 0.7032967 0.65217391]
|
|
|
|
mean value: 0.7094227340267705
|
|
|
|
key: train_fscore
|
|
value: [0.74692875 0.75434243 0.72568579 0.7496977 0.73992674 0.7404674
|
|
0.73316708 0.74725275 0.73383085 0.75845411]
|
|
|
|
mean value: 0.7429753592965128
|
|
|
|
key: test_precision
|
|
value: [0.69387755 0.57894737 0.74358974 0.70588235 0.76190476 0.67346939
|
|
0.69767442 0.75609756 0.68085106 0.625 ]
|
|
|
|
mean value: 0.6917294209042293
|
|
|
|
key: train_precision
|
|
value: [0.72209026 0.73607748 0.71149144 0.71428571 0.71294118 0.71837709
|
|
0.72058824 0.71830986 0.71776156 0.72183908]
|
|
|
|
mean value: 0.7193761896813866
|
|
|
|
key: test_recall
|
|
value: [0.77272727 0.75 0.65909091 0.81818182 0.74418605 0.76744186
|
|
0.69767442 0.70454545 0.72727273 0.68181818]
|
|
|
|
mean value: 0.732293868921776
|
|
|
|
key: train_recall
|
|
value: [0.7735369 0.7735369 0.74045802 0.78880407 0.76903553 0.76395939
|
|
0.74619289 0.77862595 0.75063613 0.79898219]
|
|
|
|
mean value: 0.7683767969930639
|
|
|
|
key: test_roc_auc
|
|
value: [0.71590909 0.60227273 0.71590909 0.73863636 0.75845666 0.70190275
|
|
0.70110994 0.73599366 0.68921776 0.63160677]
|
|
|
|
mean value: 0.6991014799154334
|
|
|
|
key: train_roc_auc
|
|
value: [0.73791349 0.7480916 0.72010178 0.73664122 0.72930148 0.73185247
|
|
0.72805828 0.73702871 0.72810994 0.74593779]
|
|
|
|
mean value: 0.7343036772968574
|
|
|
|
key: test_jcc
|
|
value: [0.57627119 0.48529412 0.53703704 0.61016949 0.60377358 0.55932203
|
|
0.53571429 0.57407407 0.54237288 0.48387097]
|
|
|
|
mean value: 0.5507899660340391
|
|
|
|
key: train_jcc
|
|
value: [0.59607843 0.60557769 0.56947162 0.59961315 0.5872093 0.58789062
|
|
0.57874016 0.59649123 0.57956778 0.61089494]
|
|
|
|
mean value: 0.5911534932157384
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01603746 0.01224732 0.01173258 0.01174664 0.01122117 0.01208568
|
|
0.01260567 0.01258731 0.01110101 0.01235771]
|
|
|
|
mean value: 0.012372255325317383
|
|
|
|
key: score_time
|
|
value: [0.04256272 0.02004099 0.01386952 0.01411438 0.01441169 0.01482582
|
|
0.01483512 0.01415348 0.01409531 0.01477814]
|
|
|
|
mean value: 0.017768716812133788
|
|
|
|
key: test_mcc
|
|
value: [0.34530694 0.37340802 0.18257419 0.41294832 0.51879367 0.35695404
|
|
0.33350951 0.38257713 0.28738215 0.40515647]
|
|
|
|
mean value: 0.3598610447642626
|
|
|
|
key: train_mcc
|
|
value: [0.58731594 0.621592 0.59890351 0.58731594 0.60505532 0.60261756
|
|
0.62543924 0.58154252 0.58881754 0.59249991]
|
|
|
|
mean value: 0.5991099467682587
|
|
|
|
key: test_accuracy
|
|
value: [0.67045455 0.68181818 0.59090909 0.70454545 0.75862069 0.67816092
|
|
0.66666667 0.68965517 0.64367816 0.70114943]
|
|
|
|
mean value: 0.6785658307210032
|
|
|
|
key: train_accuracy
|
|
value: [0.79262087 0.81043257 0.79898219 0.79262087 0.80177891 0.80050826
|
|
0.81194409 0.79034307 0.79415502 0.79542567]
|
|
|
|
mean value: 0.7988811507609339
|
|
|
|
key: test_fscore
|
|
value: [0.69473684 0.71428571 0.57142857 0.72340426 0.76404494 0.68181818
|
|
0.66666667 0.6746988 0.65934066 0.72340426]
|
|
|
|
mean value: 0.6873828885284302
|
|
|
|
key: train_fscore
|
|
value: [0.8009768 0.81490683 0.80445545 0.8009768 0.80882353 0.80783354
|
|
0.81862745 0.79553903 0.79800499 0.80245399]
|
|
|
|
mean value: 0.8052598406238634
|
|
|
|
key: test_precision
|
|
value: [0.64705882 0.64814815 0.6 0.68 0.73913043 0.66666667
|
|
0.65909091 0.71794872 0.63829787 0.68 ]
|
|
|
|
mean value: 0.6676341572506888
|
|
|
|
key: train_precision
|
|
value: [0.76995305 0.7961165 0.78313253 0.76995305 0.78199052 0.78014184
|
|
0.79146919 0.77536232 0.78239609 0.77488152]
|
|
|
|
mean value: 0.7805396621320495
|
|
|
|
key: test_recall
|
|
value: [0.75 0.79545455 0.54545455 0.77272727 0.79069767 0.69767442
|
|
0.6744186 0.63636364 0.68181818 0.77272727]
|
|
|
|
mean value: 0.7117336152219873
|
|
|
|
key: train_recall
|
|
value: [0.8346056 0.8346056 0.82697201 0.8346056 0.83756345 0.83756345
|
|
0.84771574 0.81679389 0.81424936 0.83206107]
|
|
|
|
mean value: 0.8316735769364901
|
|
|
|
key: test_roc_auc
|
|
value: [0.67045455 0.68181818 0.59090909 0.70454545 0.7589852 0.67838266
|
|
0.66675476 0.69027484 0.64323467 0.70031712]
|
|
|
|
mean value: 0.6785676532769556
|
|
|
|
key: train_roc_auc
|
|
value: [0.79262087 0.81043257 0.79898219 0.79262087 0.80173338 0.80046112
|
|
0.81189858 0.79037664 0.79418052 0.79547216]
|
|
|
|
mean value: 0.7988778884282042
|
|
|
|
key: test_jcc
|
|
value: [0.53225806 0.55555556 0.4 0.56666667 0.61818182 0.51724138
|
|
0.5 0.50909091 0.49180328 0.56666667]
|
|
|
|
mean value: 0.5257464338676614
|
|
|
|
key: train_jcc
|
|
value: [0.66802444 0.68763103 0.67287785 0.66802444 0.67901235 0.67761807
|
|
0.69294606 0.66049383 0.66390041 0.67008197]
|
|
|
|
mean value: 0.6740610436778488
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05328178 0.05062747 0.04658628 0.0554626 0.05192924 0.04524851
|
|
0.04780626 0.04653692 0.04967833 0.05186844]
|
|
|
|
mean value: 0.0499025821685791
|
|
|
|
key: score_time
|
|
value: [0.01994634 0.01860738 0.01801848 0.02021909 0.01826525 0.01802874
|
|
0.01815724 0.01966429 0.0205009 0.01840186]
|
|
|
|
mean value: 0.018980956077575682
|
|
|
|
key: test_mcc
|
|
value: [0.66143783 0.38357064 0.48342972 0.53987041 0.65539112 0.57138821
|
|
0.55620192 0.58821234 0.62173301 0.39528559]
|
|
|
|
mean value: 0.5456520787005305
|
|
|
|
key: train_mcc
|
|
value: [0.63985567 0.65908472 0.67484545 0.65836096 0.63563364 0.66781139
|
|
0.64200177 0.6513967 0.63924521 0.65891801]
|
|
|
|
mean value: 0.6527153529678907
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.68181818 0.73863636 0.76136364 0.82758621 0.7816092
|
|
0.77011494 0.79310345 0.8045977 0.68965517]
|
|
|
|
mean value: 0.7666666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.81552163 0.82315522 0.83333333 0.82569975 0.81321474 0.82846252
|
|
0.81702668 0.82210928 0.81448539 0.82337992]
|
|
|
|
mean value: 0.8216388449712406
|
|
|
|
key: test_fscore
|
|
value: [0.84 0.7254902 0.75789474 0.78787879 0.82758621 0.79569892
|
|
0.79166667 0.80434783 0.82474227 0.73267327]
|
|
|
|
mean value: 0.7887978880548652
|
|
|
|
key: train_fscore
|
|
value: [0.82961222 0.83893395 0.84533648 0.83748517 0.82807018 0.84284051
|
|
0.83058824 0.83412322 0.82943925 0.83855981]
|
|
|
|
mean value: 0.8354989038165057
|
|
|
|
key: test_precision
|
|
value: [0.75 0.63793103 0.70588235 0.70909091 0.81818182 0.74
|
|
0.71698113 0.77083333 0.75471698 0.64912281]
|
|
|
|
mean value: 0.7252740368255087
|
|
|
|
key: train_precision
|
|
value: [0.77074236 0.77021277 0.78854626 0.78444444 0.76789588 0.77849462
|
|
0.77412281 0.7804878 0.76673866 0.77136752]
|
|
|
|
mean value: 0.7753053120338202
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.84090909 0.81818182 0.88636364 0.8372093 0.86046512
|
|
0.88372093 0.84090909 0.90909091 0.84090909]
|
|
|
|
mean value: 0.86723044397463
|
|
|
|
key: train_recall
|
|
value: [0.89821883 0.92111959 0.91094148 0.89821883 0.89847716 0.91878173
|
|
0.89593909 0.8956743 0.90330789 0.91857506]
|
|
|
|
mean value: 0.9059253949186913
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.68181818 0.73863636 0.76136364 0.82769556 0.78250529
|
|
0.77140592 0.79254757 0.80338266 0.68789641]
|
|
|
|
mean value: 0.7665433403805497
|
|
|
|
key: train_roc_auc
|
|
value: [0.81552163 0.82315522 0.83333333 0.82569975 0.81310626 0.82834761
|
|
0.81692629 0.82220263 0.81459811 0.82350073]
|
|
|
|
mean value: 0.8216391547512949
|
|
|
|
key: test_jcc
|
|
value: [0.72413793 0.56923077 0.61016949 0.65 0.70588235 0.66071429
|
|
0.65517241 0.67272727 0.70175439 0.578125 ]
|
|
|
|
mean value: 0.6527913902931426
|
|
|
|
key: train_jcc
|
|
value: [0.70883534 0.72255489 0.73210634 0.72040816 0.70658683 0.72837022
|
|
0.71026157 0.71544715 0.70858283 0.722 ]
|
|
|
|
mean value: 0.7175153340213286
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.5462265 1.42994976 2.20784569 1.73624086 1.58668661 2.60854244
|
|
2.25105 2.60767412 2.84883237 1.98630857]
|
|
|
|
mean value: 2.1809356927871706
|
|
|
|
key: score_time
|
|
value: [0.01277137 0.01260996 0.0125699 0.01260591 0.01268101 0.01257396
|
|
0.01291466 0.01257181 0.01335096 0.01339889]
|
|
|
|
mean value: 0.012804841995239258
|
|
|
|
key: test_mcc
|
|
value: [0.63900965 0.51745489 0.45883147 0.54772256 0.63444041 0.70301836
|
|
0.67803941 0.75240169 0.67866682 0.56484984]
|
|
|
|
mean value: 0.6174435091168581
|
|
|
|
key: train_mcc
|
|
value: [0.92880129 0.80296278 0.89632742 0.8778626 0.83488556 0.90993013
|
|
0.89733331 0.92697253 0.94487561 0.90880964]
|
|
|
|
mean value: 0.892876085727712
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.73863636 0.72727273 0.77272727 0.81609195 0.85057471
|
|
0.82758621 0.87356322 0.83908046 0.7816092 ]
|
|
|
|
mean value: 0.8045323928944619
|
|
|
|
key: train_accuracy
|
|
value: [0.96437659 0.89821883 0.94783715 0.9389313 0.91740788 0.95425667
|
|
0.94790343 0.96315121 0.97204574 0.95425667]
|
|
|
|
mean value: 0.9458385468700997
|
|
|
|
key: test_fscore
|
|
value: [0.82608696 0.78095238 0.70731707 0.7826087 0.80487805 0.85393258
|
|
0.84536082 0.86746988 0.84444444 0.77647059]
|
|
|
|
mean value: 0.8089521476287256
|
|
|
|
key: train_fscore
|
|
value: [0.96455696 0.90430622 0.94682231 0.9389313 0.91698595 0.95555556
|
|
0.94944513 0.96238651 0.97256858 0.95477387]
|
|
|
|
mean value: 0.9466332383939996
|
|
|
|
key: test_precision
|
|
value: [0.79166667 0.67213115 0.76315789 0.75 0.84615385 0.82608696
|
|
0.75925926 0.92307692 0.82608696 0.80487805]
|
|
|
|
mean value: 0.7962497699258487
|
|
|
|
key: train_precision
|
|
value: [0.95969773 0.85327314 0.96560847 0.9389313 0.92287918 0.93028846
|
|
0.92326139 0.98148148 0.95354523 0.94292804]
|
|
|
|
mean value: 0.9371894417274584
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.93181818 0.65909091 0.81818182 0.76744186 0.88372093
|
|
0.95348837 0.81818182 0.86363636 0.75 ]
|
|
|
|
mean value: 0.8309196617336152
|
|
|
|
key: train_recall
|
|
value: [0.96946565 0.96183206 0.92875318 0.9389313 0.91116751 0.9822335
|
|
0.97715736 0.94402036 0.99236641 0.96692112]
|
|
|
|
mean value: 0.9572848451970396
|
|
|
|
key: test_roc_auc
|
|
value: [0.81818182 0.73863636 0.72727273 0.77272727 0.81553911 0.85095137
|
|
0.82901691 0.87420719 0.83879493 0.78197674]
|
|
|
|
mean value: 0.80473044397463
|
|
|
|
key: train_roc_auc
|
|
value: [0.96437659 0.89821883 0.94783715 0.9389313 0.91741582 0.95422108
|
|
0.94786621 0.96312693 0.97207153 0.95427274]
|
|
|
|
mean value: 0.9458338176980405
|
|
|
|
key: test_jcc
|
|
value: [0.7037037 0.640625 0.54716981 0.64285714 0.67346939 0.74509804
|
|
0.73214286 0.76595745 0.73076923 0.63461538]
|
|
|
|
mean value: 0.6816408004188372
|
|
|
|
key: train_jcc
|
|
value: [0.93154034 0.82532751 0.89901478 0.88489209 0.84669811 0.91489362
|
|
0.90375587 0.9275 0.94660194 0.91346154]
|
|
|
|
mean value: 0.8993685796853914
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06037998 0.04592085 0.0504756 0.05603194 0.04493618 0.04983497
|
|
0.04593444 0.04674339 0.04872298 0.04878926]
|
|
|
|
mean value: 0.04977695941925049
|
|
|
|
key: score_time
|
|
value: [0.00964427 0.00910902 0.00899935 0.00910282 0.00918436 0.00902581
|
|
0.00907683 0.00908375 0.00915027 0.0095067 ]
|
|
|
|
mean value: 0.009188318252563476
|
|
|
|
key: test_mcc
|
|
value: [0.84287052 0.70618882 0.75488987 0.6846532 0.77102073 0.84118687
|
|
0.77102073 0.79334038 0.77359882 0.67811839]
|
|
|
|
mean value: 0.7616888342891726
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92045455 0.85227273 0.875 0.84090909 0.88505747 0.91954023
|
|
0.88505747 0.89655172 0.88505747 0.83908046]
|
|
|
|
mean value: 0.879898119122257
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.91764706 0.85714286 0.88172043 0.84782609 0.88636364 0.92134831
|
|
0.88636364 0.89655172 0.88095238 0.84090909]
|
|
|
|
mean value: 0.8816825216363853
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95121951 0.82978723 0.83673469 0.8125 0.86666667 0.89130435
|
|
0.86666667 0.90697674 0.925 0.84090909]
|
|
|
|
mean value: 0.8727764956369783
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88636364 0.88636364 0.93181818 0.88636364 0.90697674 0.95348837
|
|
0.90697674 0.88636364 0.84090909 0.84090909]
|
|
|
|
mean value: 0.8926532769556025
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92045455 0.85227273 0.875 0.84090909 0.88530655 0.919926
|
|
0.88530655 0.89667019 0.88557082 0.8390592 ]
|
|
|
|
mean value: 0.8800475687103593
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.84782609 0.75 0.78846154 0.73584906 0.79591837 0.85416667
|
|
0.79591837 0.8125 0.78723404 0.7254902 ]
|
|
|
|
mean value: 0.7893364322014
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17561674 0.17232823 0.17396235 0.17050743 0.1736331 0.17472339
|
|
0.17230892 0.17626953 0.17193913 0.17061377]
|
|
|
|
mean value: 0.17319025993347167
|
|
|
|
key: score_time
|
|
value: [0.02014542 0.02008057 0.02028108 0.01903844 0.01922488 0.01906919
|
|
0.02035928 0.01956439 0.02045226 0.01898384]
|
|
|
|
mean value: 0.019719934463500975
|
|
|
|
key: test_mcc
|
|
value: [0.57551157 0.48038446 0.54772256 0.56950711 0.72689655 0.70301836
|
|
0.67811839 0.67866682 0.65539112 0.61028941]
|
|
|
|
mean value: 0.6225506347313268
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78409091 0.73863636 0.77272727 0.78409091 0.86206897 0.85057471
|
|
0.83908046 0.83908046 0.82758621 0.8045977 ]
|
|
|
|
mean value: 0.810253396029258
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.75268817 0.7826087 0.79120879 0.85365854 0.85393258
|
|
0.8372093 0.84444444 0.82758621 0.81318681]
|
|
|
|
mean value: 0.8156523546612395
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.74509804 0.71428571 0.75 0.76595745 0.8974359 0.82608696
|
|
0.8372093 0.82608696 0.8372093 0.78723404]
|
|
|
|
mean value: 0.7986603657993642
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.86363636 0.79545455 0.81818182 0.81818182 0.81395349 0.88372093
|
|
0.8372093 0.86363636 0.81818182 0.84090909]
|
|
|
|
mean value: 0.8353065539112051
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78409091 0.73863636 0.77272727 0.78409091 0.8615222 0.85095137
|
|
0.8390592 0.83879493 0.82769556 0.80417548]
|
|
|
|
mean value: 0.8101744186046512
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.60344828 0.64285714 0.65454545 0.74468085 0.74509804
|
|
0.72 0.73076923 0.70588235 0.68518519]
|
|
|
|
mean value: 0.6899133199106442
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01173329 0.01363635 0.01289082 0.01202559 0.0125587 0.01259565
|
|
0.0118103 0.01261592 0.01333976 0.01175308]
|
|
|
|
mean value: 0.012495946884155274
|
|
|
|
key: score_time
|
|
value: [0.00911117 0.00918341 0.00908852 0.00974941 0.00897264 0.0091536
|
|
0.00952911 0.00908566 0.0098989 0.00913501]
|
|
|
|
mean value: 0.009290742874145507
|
|
|
|
key: test_mcc
|
|
value: [0.62155249 0.33071891 0.46225016 0.46225016 0.41045404 0.61371748
|
|
0.5404983 0.44952813 0.42577098 0.49974958]
|
|
|
|
mean value: 0.48164902399471576
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.80681818 0.65909091 0.72727273 0.72727273 0.70114943 0.8045977
|
|
0.77011494 0.72413793 0.71264368 0.74712644]
|
|
|
|
mean value: 0.7380224660397074
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.82105263 0.7 0.75 0.75 0.72340426 0.81318681
|
|
0.76190476 0.73913043 0.72527473 0.73170732]
|
|
|
|
mean value: 0.7515660939120177
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76470588 0.625 0.69230769 0.69230769 0.66666667 0.77083333
|
|
0.7804878 0.70833333 0.70212766 0.78947368]
|
|
|
|
mean value: 0.7192243748964702
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88636364 0.79545455 0.81818182 0.81818182 0.79069767 0.86046512
|
|
0.74418605 0.77272727 0.75 0.68181818]
|
|
|
|
mean value: 0.7918076109936575
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.80681818 0.65909091 0.72727273 0.72727273 0.70216702 0.80523256
|
|
0.7698203 0.72357294 0.7122093 0.74788584]
|
|
|
|
mean value: 0.7381342494714588
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.69642857 0.53846154 0.6 0.6 0.56666667 0.68518519
|
|
0.61538462 0.5862069 0.56896552 0.57692308]
|
|
|
|
mean value: 0.6034222067842757
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.66490698 2.55711341 2.59944248 2.58720875 2.57398891 2.7415657
|
|
2.7710402 2.6602118 2.64705348 2.56468678]
|
|
|
|
mean value: 2.6367218494415283
|
|
|
|
key: score_time
|
|
value: [0.09884381 0.09871507 0.10496402 0.10470939 0.10334086 0.10789132
|
|
0.10715008 0.10783505 0.0989809 0.09773517]
|
|
|
|
mean value: 0.1030165672302246
|
|
|
|
key: test_mcc
|
|
value: [0.90909091 0.78582528 0.75488987 0.75488987 0.83923862 0.84118687
|
|
0.87056589 0.81606765 0.79323121 0.86289151]
|
|
|
|
mean value: 0.8227877669037194
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95454545 0.88636364 0.875 0.875 0.91954023 0.91954023
|
|
0.93103448 0.90804598 0.89655172 0.93103448]
|
|
|
|
mean value: 0.9096656217345872
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.95454545 0.89583333 0.88172043 0.88172043 0.91764706 0.92134831
|
|
0.93478261 0.90909091 0.8988764 0.93333333]
|
|
|
|
mean value: 0.9128898277138389
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95454545 0.82692308 0.83673469 0.83673469 0.92857143 0.89130435
|
|
0.87755102 0.90909091 0.88888889 0.91304348]
|
|
|
|
mean value: 0.886338799226998
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.95454545 0.97727273 0.93181818 0.93181818 0.90697674 0.95348837
|
|
1. 0.90909091 0.90909091 0.95454545]
|
|
|
|
mean value: 0.9428646934460888
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95454545 0.88636364 0.875 0.875 0.91939746 0.919926
|
|
0.93181818 0.90803383 0.89640592 0.9307611 ]
|
|
|
|
mean value: 0.9097251585623679
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.91304348 0.81132075 0.78846154 0.78846154 0.84782609 0.85416667
|
|
0.87755102 0.83333333 0.81632653 0.875 ]
|
|
|
|
mean value: 0.8405490947877857
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.64
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.07194018 1.12413049 1.11955142 1.11322117 1.15317225 1.10507035
|
|
1.20874882 1.12016249 1.10679674 1.14551282]
|
|
|
|
mean value: 1.1268306732177735
|
|
|
|
key: score_time
|
|
value: [0.21674871 0.26874399 0.25944543 0.24885511 0.28227925 0.21746516
|
|
0.27740693 0.26276541 0.28292441 0.24287367]
|
|
|
|
mean value: 0.25595080852508545
|
|
|
|
key: test_mcc
|
|
value: [0.91003151 0.8057162 0.73029674 0.77594029 0.81683533 0.77008457
|
|
0.82421385 0.83923862 0.77008457 0.86289151]
|
|
|
|
mean value: 0.8105333182128026
|
|
|
|
key: train_mcc
|
|
value: [0.90897389 0.90882071 0.90609005 0.90897389 0.91388817 0.91140754
|
|
0.90620623 0.91910899 0.90621167 0.91402422]
|
|
|
|
mean value: 0.9103705353608575
|
|
|
|
key: test_accuracy
|
|
value: [0.95454545 0.89772727 0.86363636 0.88636364 0.90804598 0.88505747
|
|
0.90804598 0.91954023 0.88505747 0.93103448]
|
|
|
|
mean value: 0.903905433646813
|
|
|
|
key: train_accuracy
|
|
value: [0.95419847 0.95419847 0.95292621 0.95419847 0.95679797 0.95552732
|
|
0.95298602 0.95933926 0.95298602 0.95679797]
|
|
|
|
mean value: 0.9549956190125157
|
|
|
|
key: test_fscore
|
|
value: [0.95555556 0.90526316 0.86956522 0.89130435 0.9047619 0.88372093
|
|
0.91304348 0.92134831 0.88636364 0.93333333]
|
|
|
|
mean value: 0.9064259876226728
|
|
|
|
key: train_fscore
|
|
value: [0.955 0.95488722 0.95345912 0.955 0.95739348 0.95619524
|
|
0.95357591 0.95989975 0.95345912 0.95739348]
|
|
|
|
mean value: 0.9556263327547102
|
|
|
|
key: test_precision
|
|
value: [0.93478261 0.84313725 0.83333333 0.85416667 0.92682927 0.88372093
|
|
0.85714286 0.91111111 0.88636364 0.91304348]
|
|
|
|
mean value: 0.8843631145001328
|
|
|
|
key: train_precision
|
|
value: [0.93857494 0.94074074 0.94278607 0.93857494 0.94554455 0.94320988
|
|
0.94292804 0.94567901 0.94278607 0.94320988]
|
|
|
|
mean value: 0.9424034116783878
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.97727273 0.90909091 0.93181818 0.88372093 0.88372093
|
|
0.97674419 0.93181818 0.88636364 0.95454545]
|
|
|
|
mean value: 0.9312367864693446
|
|
|
|
key: train_recall
|
|
value: [0.97201018 0.96946565 0.96437659 0.97201018 0.96954315 0.96954315
|
|
0.96446701 0.97455471 0.96437659 0.97201018]
|
|
|
|
mean value: 0.9692357370739205
|
|
|
|
key: test_roc_auc
|
|
value: [0.95454545 0.89772727 0.86363636 0.88636364 0.90776956 0.88504228
|
|
0.90882664 0.91939746 0.88504228 0.9307611 ]
|
|
|
|
mean value: 0.9039112050739958
|
|
|
|
key: train_roc_auc
|
|
value: [0.95419847 0.95419847 0.95292621 0.95419847 0.95678175 0.95550949
|
|
0.95297142 0.95935857 0.95300048 0.95681727]
|
|
|
|
mean value: 0.954996060500381
|
|
|
|
key: test_jcc
|
|
value: [0.91489362 0.82692308 0.76923077 0.80392157 0.82608696 0.79166667
|
|
0.84 0.85416667 0.79591837 0.875 ]
|
|
|
|
mean value: 0.8297807689004585
|
|
|
|
key: train_jcc
|
|
value: [0.9138756 0.91366906 0.91105769 0.9138756 0.91826923 0.91606715
|
|
0.91127098 0.92289157 0.91105769 0.91826923]
|
|
|
|
mean value: 0.915030380283576
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02789688 0.01565957 0.0157845 0.01578856 0.01591611 0.01583362
0.01590157 0.01752901 0.01769686 0.0173285 ]
mean value: 0.017533516883850096
key: score_time
value: [0.01299381 0.01210165 0.0122776 0.01220608 0.01224947 0.01215458
0.01221299 0.01325798 0.01327944 0.01305866]
mean value: 0.012579226493835449
key: test_mcc
value: [0.43463356 0.21410373 0.43463356 0.48342972 0.51718675 0.4070455
0.40221987 0.47273749 0.37964137 0.26461585]
mean value: 0.4010247399358443
key: train_mcc
value: [0.47703926 0.49682698 0.44056884 0.47587923 0.46007909 0.46468129
0.45642617 0.47568224 0.45667135 0.49463369]
mean value: 0.4698488137783724
key: test_accuracy
value: [0.71590909 0.60227273 0.71590909 0.73863636 0.75862069 0.70114943
0.70114943 0.73563218 0.68965517 0.63218391]
mean value: 0.6991118077324974
key: train_accuracy
value: [0.73791349 0.7480916 0.72010178 0.73664122 0.72935197 0.73189327
0.72808132 0.73697586 0.72808132 0.74587039]
mean value: 0.7343002221209153
key: test_fscore
value: [0.7311828 0.65346535 0.69879518 0.75789474 0.75294118 0.7173913
0.69767442 0.72941176 0.7032967 0.65217391]
mean value: 0.7094227340267705
key: train_fscore
value: [0.74692875 0.75434243 0.72568579 0.7496977 0.73992674 0.7404674
0.73316708 0.74725275 0.73383085 0.75845411]
mean value: 0.7429753592965128
key: test_precision
value: [0.69387755 0.57894737 0.74358974 0.70588235 0.76190476 0.67346939
0.69767442 0.75609756 0.68085106 0.625 ]
mean value: 0.6917294209042293
key: train_precision
value: [0.72209026 0.73607748 0.71149144 0.71428571 0.71294118 0.71837709
0.72058824 0.71830986 0.71776156 0.72183908]
mean value: 0.7193761896813866
key: test_recall
value: [0.77272727 0.75 0.65909091 0.81818182 0.74418605 0.76744186
0.69767442 0.70454545 0.72727273 0.68181818]
mean value: 0.732293868921776
key: train_recall
value: [0.7735369 0.7735369 0.74045802 0.78880407 0.76903553 0.76395939
0.74619289 0.77862595 0.75063613 0.79898219]
mean value: 0.7683767969930639
key: test_roc_auc
value: [0.71590909 0.60227273 0.71590909 0.73863636 0.75845666 0.70190275
0.70110994 0.73599366 0.68921776 0.63160677]
mean value: 0.6991014799154334
key: train_roc_auc
value: [0.73791349 0.7480916 0.72010178 0.73664122 0.72930148 0.73185247
0.72805828 0.73702871 0.72810994 0.74593779]
mean value: 0.7343036772968574
key: test_jcc
value: [0.57627119 0.48529412 0.53703704 0.61016949 0.60377358 0.55932203
0.53571429 0.57407407 0.54237288 0.48387097]
mean value: 0.5507899660340391
key: train_jcc
value: [0.59607843 0.60557769 0.56947162 0.59961315 0.5872093 0.58789062
0.57874016 0.59649123 0.57956778 0.61089494]
mean value: 0.5911534932157384
MCC on Blind test: 0.43
Accuracy on Blind test: 0.73
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.1539228 0.13720822 0.14518332 0.12025738 0.13158059 0.13478374
0.12884831 0.12683439 0.12769008 0.13258982]
mean value: 0.13388986587524415
key: score_time
value: [0.01118755 0.01118207 0.0112083 0.01213813 0.01123714 0.01117682
0.01118374 0.0114069 0.01129794 0.01113939]
mean value: 0.01131579875946045
key: test_mcc
value: [0.93205893 0.82589664 0.79730996 0.82158384 0.79334038 0.81702814
0.87056589 0.83923862 0.81702814 0.86205074]
mean value: 0.8376101275955953
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.96590909 0.90909091 0.89772727 0.90909091 0.89655172 0.90804598
0.93103448 0.91954023 0.90804598 0.93103448]
mean value: 0.9176071055381401
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.96551724 0.91489362 0.9010989 0.91304348 0.89655172 0.90909091
0.93478261 0.92134831 0.90697674 0.93181818]
mean value: 0.919512172029582
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.97674419 0.86 0.87234043 0.875 0.88636364 0.88888889
0.87755102 0.91111111 0.92857143 0.93181818]
mean value: 0.9008388878739837
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.95454545 0.97727273 0.93181818 0.95454545 0.90697674 0.93023256
1. 0.93181818 0.88636364 0.93181818]
mean value: 0.9405391120507399
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.96590909 0.90909091 0.89772727 0.90909091 0.89667019 0.9082981
0.93181818 0.91939746 0.9082981 0.93102537]
mean value: 0.9177325581395349
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.93333333 0.84313725 0.82 0.84 0.8125 0.83333333
0.87755102 0.85416667 0.82978723 0.87234043]
mean value: 0.8516149268217925
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.74
Accuracy on Blind test: 0.87
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.08155274 0.07502103 0.05325723 0.0756371 0.05765128 0.13894176
0.07617521 0.08730435 0.1101017 0.07325125]
mean value: 0.08288936614990235
key: score_time
value: [0.01919389 0.01230693 0.01228714 0.01243138 0.01909542 0.03655434
0.01920414 0.02307391 0.01900601 0.01966429]
mean value: 0.019281744956970215
key: test_mcc
value: [0.6882472 0.37796447 0.45643546 0.59648091 0.65994555 0.5504913
0.67803941 0.70984404 0.51803019 0.38309043]
mean value: 0.5618568970485822
key: train_mcc
value: [0.7521962 0.74937604 0.79308611 0.76606427 0.77295317 0.75613181
0.74697508 0.76400062 0.765522 0.76638343]
mean value: 0.7632688717792019
key: test_accuracy
value: [0.84090909 0.68181818 0.72727273 0.79545455 0.82758621 0.77011494
0.82758621 0.85057471 0.75862069 0.68965517]
mean value: 0.7769592476489028
key: train_accuracy
value: [0.8740458 0.8740458 0.8956743 0.88167939 0.88564168 0.87674714
0.87166455 0.88055909 0.88055909 0.88182973]
mean value: 0.8802446563268895
key: test_fscore
value: [0.85106383 0.72 0.73913043 0.80851064 0.83516484 0.78723404
0.84536082 0.86315789 0.76923077 0.71578947]
mean value: 0.7934642742979832
key: train_fscore
value: [0.88029021 0.8776267 0.89901478 0.88644689 0.88943489 0.8818514
0.87787183 0.88536585 0.88647343 0.88644689]
mean value: 0.8850822856062937
key: test_precision
value: [0.8 0.64285714 0.70833333 0.76 0.79166667 0.7254902
0.75925926 0.80392157 0.74468085 0.66666667]
mean value: 0.7402875684552781
key: train_precision
value: [0.83870968 0.85336538 0.87112172 0.85211268 0.86190476 0.84777518
0.83833718 0.8501171 0.84367816 0.85211268]
mean value: 0.8509234509459607
key: test_recall
value: [0.90909091 0.81818182 0.77272727 0.86363636 0.88372093 0.86046512
0.95348837 0.93181818 0.79545455 0.77272727]
mean value: 0.8561310782241015
key: train_recall
value: [0.92620865 0.90330789 0.92875318 0.92366412 0.91878173 0.91878173
0.9213198 0.92366412 0.93384224 0.92366412]
mean value: 0.922198757443071
key: test_roc_auc
value: [0.84090909 0.68181818 0.72727273 0.79545455 0.8282241 0.77114165
0.82901691 0.84963002 0.75819239 0.68868922]
mean value: 0.7770348837209302
key: train_roc_auc
value: [0.8740458 0.8740458 0.8956743 0.88167939 0.88559951 0.87669366
0.87160137 0.88061379 0.8806267 0.88188282]
mean value: 0.8802463155991269
key: test_jcc
value: [0.74074074 0.5625 0.5862069 0.67857143 0.71698113 0.64912281
0.73214286 0.75925926 0.625 0.55737705]
mean value: 0.6607902170539353
key: train_jcc
value: [0.78617711 0.78193833 0.81655481 0.79605263 0.80088496 0.78867102
0.78232759 0.79431072 0.79609544 0.79605263]
mean value: 0.7939065237534392
MCC on Blind test: 0.51
Accuracy on Blind test: 0.76
Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.01928306 0.01527429 0.01542139 0.01770949 0.01664853 0.01519966
0.01514888 0.01584053 0.01532054 0.01538372]
mean value: 0.016123008728027344
key: score_time
value: [0.01221609 0.01213503 0.01222253 0.01570082 0.01304436 0.01203132
0.01204228 0.01213694 0.01216316 0.01246881]
mean value: 0.012616133689880371
key: test_mcc
value: [0.60092521 0.25819889 0.43192975 0.59648091 0.51718675 0.44820296
0.42547569 0.42577098 0.38309043 0.33458714]
mean value: 0.4421848698943361
key: train_mcc
value: [0.46296406 0.49560803 0.45453146 0.4500587 0.45896356 0.45870907
0.47255984 0.47612264 0.44425601 0.4758961 ]
mean value: 0.4649669477593169
key: test_accuracy
value: [0.79545455 0.625 0.71590909 0.79545455 0.75862069 0.72413793
0.71264368 0.71264368 0.68965517 0.66666667]
mean value: 0.7196185997910136
key: train_accuracy
value: [0.7302799 0.74681934 0.7264631 0.72391858 0.72808132 0.72808132
0.73570521 0.73697586 0.72172808 0.73697586]
mean value: 0.7315028565331678
key: test_fscore
value: [0.8125 0.66666667 0.71264368 0.80851064 0.75294118 0.72093023
0.71264368 0.72527473 0.71578947 0.68817204]
mean value: 0.7316072312284795
key: train_fscore
value: [0.7433414 0.75761267 0.73748474 0.7369697 0.74278846 0.74216867
0.74509804 0.74848117 0.72929543 0.74786845]
mean value: 0.7431108727766951
key: test_precision
value: [0.75 0.6 0.72093023 0.76 0.76190476 0.72093023
0.70454545 0.70212766 0.66666667 0.65306122]
mean value: 0.7040166232297426
key: train_precision
value: [0.70900693 0.72663551 0.70892019 0.7037037 0.70547945 0.70642202
0.72037915 0.71627907 0.70913462 0.71728972]
mean value: 0.7123250356023364
key: test_recall
value: [0.88636364 0.75 0.70454545 0.86363636 0.74418605 0.72093023
0.72093023 0.75 0.77272727 0.72727273]
mean value: 0.7640591966173361
key: train_recall
value: [0.78117048 0.7913486 0.76844784 0.7735369 0.78426396 0.78172589
0.7715736 0.78371501 0.75063613 0.78117048]
mean value: 0.7767588897069271
key: test_roc_auc
value: [0.79545455 0.625 0.71590909 0.79545455 0.75845666 0.72410148
0.71273784 0.7122093 0.68868922 0.66596195]
mean value: 0.7193974630021142
key: train_roc_auc
value: [0.7302799 0.74681934 0.7264631 0.72391858 0.72800984 0.72801307
0.73565958 0.73703517 0.72176477 0.73703194]
mean value: 0.731499528551685
key: test_jcc
value: [0.68421053 0.5 0.55357143 0.67857143 0.60377358 0.56363636
0.55357143 0.56896552 0.55737705 0.52459016]
mean value: 0.5788267490928233
key: train_jcc
value: [0.59152216 0.60980392 0.58413926 0.58349328 0.59082218 0.59003831
0.59375 0.59805825 0.57392996 0.59727626]
mean value: 0.5912833598721492
MCC on Blind test: 0.29
Accuracy on Blind test: 0.66
Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.02707696 0.02653337 0.02409434 0.02513218 0.03793049 0.03685093
0.02607822 0.0229454 0.02661228 0.04135823]
mean value: 0.029461240768432616
key: score_time
value: [0.01222515 0.01221347 0.01216626 0.012182 0.02011037 0.01222014
0.01242065 0.01490355 0.01259971 0.01238799]
mean value: 0.013342928886413575
key: test_mcc
value: [0.50709255 0.36363636 0.50709255 0.48758163 0.67900591 0.27128229
0.50648727 0.55996332 0.5606067 0.52749822]
mean value: 0.4970246803472478
key: train_mcc
value: [0.54495505 0.62333053 0.4455788 0.5875648 0.71349834 0.45128891
0.61269854 0.55047307 0.63666649 0.72772028]
mean value: 0.58937748073151
key: test_accuracy
value: [0.70454545 0.68181818 0.70454545 0.73863636 0.83908046 0.59770115
0.74712644 0.75862069 0.77011494 0.75862069]
mean value: 0.7300809822361547
key: train_accuracy
value: [0.73536896 0.80788804 0.67048346 0.78371501 0.85260483 0.67217281
0.79923761 0.74205845 0.79923761 0.85768742]
mean value: 0.7720454200089883
key: test_fscore
value: [0.77192982 0.68181818 0.77192982 0.70886076 0.84090909 0.36363636
0.71052632 0.8 0.8 0.78350515]
mean value: 0.7233115515408763
key: train_fscore
value: [0.78861789 0.79172414 0.75072185 0.75146199 0.86320755 0.51685393
0.77556818 0.79136691 0.82826087 0.86946387]
mean value: 0.7727247167420862
key: test_precision
value: [0.62857143 0.68181818 0.62857143 0.8 0.82222222 0.83333333
0.81818182 0.68852459 0.71428571 0.71698113]
mean value: 0.7332489849223534
key: train_precision
value: [0.65651438 0.86445783 0.60371517 0.88316151 0.8061674 0.98571429
0.88064516 0.6637931 0.72296015 0.80215054]
mean value: 0.7869279536805144
key: test_recall
value: [1. 0.68181818 1. 0.63636364 0.86046512 0.23255814
0.62790698 0.95454545 0.90909091 0.86363636]
mean value: 0.7766384778012685
key: train_recall
value: [0.98727735 0.7302799 0.99236641 0.65394402 0.92893401 0.35025381
0.6928934 0.97964377 0.96946565 0.94910941]
mean value: 0.8234167732269023
key: test_roc_auc
value: [0.70454545 0.68181818 0.70454545 0.73863636 0.83932347 0.5935518
0.74577167 0.75634249 0.76849894 0.75739958]
mean value: 0.7290433403805496
key: train_roc_auc
value: [0.73536896 0.80788804 0.67048346 0.78371501 0.85250772 0.67258237
0.79937291 0.74235995 0.79945364 0.85780344]
mean value: 0.7721535500703943
key: test_jcc
value: [0.62857143 0.51724138 0.62857143 0.54901961 0.7254902 0.22222222
0.55102041 0.66666667 0.66666667 0.6440678 ]
mean value: 0.5799537800703761
key: train_jcc
value: [0.65100671 0.65525114 0.6009245 0.60187354 0.7593361 0.34848485
0.63341067 0.6547619 0.70686456 0.76907216]
mean value: 0.6380986143132776
MCC on Blind test: 0.43
Accuracy on Blind test: 0.73
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03708172 0.03154826 0.02611279 0.050071 0.04132271 0.03019166
|
|
0.04098511 0.0399518 0.02971935 0.03148985]
|
|
|
|
mean value: 0.03584742546081543
|
|
|
|
key: score_time
|
|
value: [0.01232719 0.01233745 0.01214862 0.01300573 0.01256013 0.0125978
|
|
0.01266861 0.01263452 0.01259875 0.01262116]
|
|
|
|
mean value: 0.012549996376037598
|
|
|
|
key: test_mcc
|
|
value: [0.56694671 0.54312363 0.47140452 0.60678804 0.63521 0.15516639
|
|
0.5751254 0.50394847 0.5633473 0.36454131]
|
|
|
|
mean value: 0.49856017625519344
|
|
|
|
key: train_mcc
|
|
value: [0.6109196 0.69300077 0.41316998 0.76800614 0.7421714 0.19534962
|
|
0.59538036 0.55886122 0.64229522 0.3612683 ]
|
|
|
|
mean value: 0.5580422606977329
|
|
|
|
key: test_accuracy
|
|
value: [0.77272727 0.73863636 0.68181818 0.79545455 0.81609195 0.52873563
|
|
0.74712644 0.73563218 0.77011494 0.62068966]
|
|
|
|
mean value: 0.7207027168234065
|
|
|
|
key: train_accuracy
|
|
value: [0.78880407 0.8307888 0.6475827 0.8778626 0.8678526 0.53621347
|
|
0.76747141 0.76620076 0.81448539 0.61880559]
|
|
|
|
mean value: 0.7516067392843633
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.78899083 0.75862069 0.81632653 0.82222222 0.08888889
|
|
0.7962963 0.68493151 0.73684211 0.72727273]
|
|
|
|
mean value: 0.6957233898011256
|
|
|
|
key: train_fscore
|
|
value: [0.74772036 0.85271318 0.73892554 0.88785047 0.87619048 0.13711584
|
|
0.80957336 0.72372372 0.79320113 0.72273567]
|
|
|
|
mean value: 0.7289749760328404
|
|
|
|
key: test_precision
|
|
value: [0.875 0.66153846 0.61111111 0.74074074 0.78723404 1.
|
|
0.66153846 0.86206897 0.875 0.57142857]
|
|
|
|
mean value: 0.7645660354427779
|
|
|
|
key: train_precision
|
|
value: [0.92830189 0.75490196 0.58682635 0.82073434 0.82511211 1.
|
|
0.68606702 0.88278388 0.89456869 0.56748911]
|
|
|
|
mean value: 0.7946785350697182
|
|
|
|
key: test_recall
|
|
value: [0.63636364 0.97727273 1. 0.90909091 0.86046512 0.04651163
|
|
1. 0.56818182 0.63636364 1. ]
|
|
|
|
mean value: 0.7634249471458774
|
|
|
|
key: train_recall
|
|
value: [0.6259542 0.97964377 0.99745547 0.96692112 0.93401015 0.07360406
|
|
0.98730964 0.61323155 0.71246819 0.99491094]
|
|
|
|
mean value: 0.78855090995983
|
|
|
|
key: test_roc_auc
|
|
value: [0.77272727 0.73863636 0.68181818 0.79545455 0.81659619 0.52325581
|
|
0.75 0.73757928 0.77167019 0.61627907]
|
|
|
|
mean value: 0.7204016913319239
|
|
|
|
key: train_roc_auc
|
|
value: [0.78880407 0.8307888 0.6475827 0.8778626 0.86776843 0.53680203
|
|
0.76719172 0.76600664 0.81435592 0.61928288]
|
|
|
|
mean value: 0.751644579636016
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.65151515 0.61111111 0.68965517 0.69811321 0.04651163
|
|
0.66153846 0.52083333 0.58333333 0.57142857]
|
|
|
|
mean value: 0.5617373303461235
|
|
|
|
key: train_jcc
|
|
value: [0.59708738 0.74324324 0.58594918 0.79831933 0.77966102 0.07360406
|
|
0.68006993 0.56705882 0.657277 0.5658466 ]
|
|
|
|
mean value: 0.6048116553391599
|
|
|
|
MCC on Blind test: 0.4
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.26197743 0.24748087 0.24301672 0.24210167 0.24358678 0.24519014
|
|
0.24583149 0.24984431 0.24517965 0.24366021]
|
|
|
|
mean value: 0.2467869281768799
|
|
|
|
key: score_time
|
|
value: [0.0159831 0.01668954 0.01640892 0.016747 0.01565933 0.01589561
|
|
0.01652861 0.01699233 0.01569104 0.01592636]
|
|
|
|
mean value: 0.01625218391418457
|
|
|
|
key: test_mcc
|
|
value: [0.88843109 0.80064077 0.82158384 0.80064077 0.77077916 0.7951307
|
|
0.75739672 0.79323121 0.7951307 0.79480784]
|
|
|
|
mean value: 0.8017772796644179
|
|
|
|
key: train_mcc
|
|
value: [0.88584325 0.89323388 0.90609005 0.89840499 0.91129031 0.88577908
|
|
0.8764911 0.8810448 0.90347852 0.88328576]
|
|
|
|
mean value: 0.8924941735039372
|
|
|
|
key: test_accuracy
|
|
value: [0.94318182 0.89772727 0.90909091 0.89772727 0.88505747 0.89655172
|
|
0.87356322 0.89655172 0.89655172 0.89655172]
|
|
|
|
mean value: 0.899255485893417
|
|
|
|
key: train_accuracy
|
|
value: [0.94274809 0.94656489 0.95292621 0.94910941 0.95552732 0.94282084
|
|
0.93773825 0.94027954 0.95171537 0.94155019]
|
|
|
|
mean value: 0.9460980112580062
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.90322581 0.91304348 0.90322581 0.88095238 0.8988764
|
|
0.88172043 0.8988764 0.89411765 0.9010989 ]
|
|
|
|
mean value: 0.9016313729958727
|
|
|
|
key: train_fscore
|
|
value: [0.94353827 0.9469697 0.95345912 0.94962217 0.95608532 0.94339623
|
|
0.93928129 0.94117647 0.95189873 0.94206549]
|
|
|
|
mean value: 0.9467492782258208
|
|
|
|
key: test_precision
|
|
value: [0.97560976 0.85714286 0.875 0.85714286 0.90243902 0.86956522
|
|
0.82 0.88888889 0.92682927 0.87234043]
|
|
|
|
mean value: 0.884495829487831
|
|
|
|
key: train_precision
|
|
value: [0.93069307 0.93984962 0.94278607 0.94014963 0.94540943 0.93516209
|
|
0.91767554 0.92610837 0.94710327 0.93266833]
|
|
|
|
mean value: 0.9357605435912151
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.95454545 0.95454545 0.95454545 0.86046512 0.93023256
|
|
0.95348837 0.90909091 0.86363636 0.93181818]
|
|
|
|
mean value: 0.9221458773784356
|
|
|
|
key: train_recall
|
|
value: [0.956743 0.95419847 0.96437659 0.95928753 0.96700508 0.95177665
|
|
0.96192893 0.956743 0.956743 0.95165394]
|
|
|
|
mean value: 0.9580456206972269
|
|
|
|
key: test_roc_auc
|
|
value: [0.94318182 0.89772727 0.90909091 0.89772727 0.88477801 0.89693446
|
|
0.87447146 0.89640592 0.89693446 0.89614165]
|
|
|
|
mean value: 0.8993393234672304
|
|
|
|
key: train_roc_auc
|
|
value: [0.94274809 0.94656489 0.95292621 0.94910941 0.95551272 0.94280944
|
|
0.93770747 0.94030044 0.95172176 0.94156301]
|
|
|
|
mean value: 0.9460963433693701
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.82352941 0.84 0.82352941 0.78723404 0.81632653
|
|
0.78846154 0.81632653 0.80851064 0.82 ]
|
|
|
|
mean value: 0.8212806992955393
|
|
|
|
key: train_jcc
|
|
value: [0.89311164 0.89928058 0.91105769 0.90407674 0.91586538 0.89285714
|
|
0.88551402 0.88888889 0.90821256 0.89047619]
|
|
|
|
mean value: 0.8989340831326912
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.21937823 0.11462379 0.13031578 0.22587872 0.22438908 0.23684406
|
|
0.24717188 0.22145414 0.2242465 0.24388313]
|
|
|
|
mean value: 0.20881853103637696
|
|
|
|
key: score_time
|
|
value: [0.0309608 0.03981233 0.03937268 0.03936577 0.03722668 0.04367852
|
|
0.03383708 0.0447216 0.01878834 0.03858972]
|
|
|
|
mean value: 0.03663535118103027
|
|
|
|
key: test_mcc
|
|
value: [0.91003151 0.82158384 0.75488987 0.77352678 0.79334038 0.81702814
|
|
0.84118687 0.90904296 0.79334038 0.81683533]
|
|
|
|
mean value: 0.8230806070229827
|
|
|
|
key: train_mcc
|
|
value: [0.98728055 0.98987316 0.98987316 0.98728055 0.98229048 0.9847522
|
|
0.9747008 0.99240487 0.98732207 0.99746191]
|
|
|
|
mean value: 0.9873239747455131
|
|
|
|
key: test_accuracy
|
|
value: [0.95454545 0.90909091 0.875 0.88636364 0.89655172 0.90804598
|
|
0.91954023 0.95402299 0.89655172 0.90804598]
|
|
|
|
mean value: 0.9107758620689655
|
|
|
|
key: train_accuracy
|
|
value: [0.99363868 0.99491094 0.99491094 0.99363868 0.99110546 0.99237611
|
|
0.98729352 0.99618806 0.99364676 0.99872935]
|
|
|
|
mean value: 0.9936438499665363
|
|
|
|
key: test_fscore
|
|
value: [0.95348837 0.91304348 0.88172043 0.88888889 0.89655172 0.90909091
|
|
0.92134831 0.95348837 0.89655172 0.91111111]
|
|
|
|
mean value: 0.9125283324527956
|
|
|
|
key: train_fscore
|
|
value: [0.99363057 0.99488491 0.99488491 0.99364676 0.99106003 0.99238579
|
|
0.98721228 0.99616858 0.9936143 0.99872611]
|
|
|
|
mean value: 0.9936214243611737
|
|
|
|
key: test_precision
|
|
value: [0.97619048 0.875 0.83673469 0.86956522 0.88636364 0.88888889
|
|
0.89130435 0.97619048 0.90697674 0.89130435]
|
|
|
|
mean value: 0.8998518828740554
|
|
|
|
key: train_precision
|
|
value: [0.99489796 1. 1. 0.99238579 0.99742931 0.99238579
|
|
0.99484536 1. 0.9974359 1. ]
|
|
|
|
mean value: 0.996938009696097
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.95454545 0.93181818 0.90909091 0.90697674 0.93023256
|
|
0.95348837 0.93181818 0.88636364 0.93181818]
|
|
|
|
mean value: 0.9267970401691332
|
|
|
|
key: train_recall
|
|
value: [0.99236641 0.98982188 0.98982188 0.99491094 0.98477157 0.99238579
|
|
0.97969543 0.99236641 0.98982188 0.99745547]
|
|
|
|
mean value: 0.9903417677374355
|
|
|
|
key: test_roc_auc
|
|
value: [0.95454545 0.90909091 0.875 0.88636364 0.89667019 0.9082981
|
|
0.919926 0.95428118 0.89667019 0.90776956]
|
|
|
|
mean value: 0.9108615221987315
|
|
|
|
key: train_roc_auc
|
|
value: [0.99363868 0.99491094 0.99491094 0.99363868 0.99111352 0.9923761
|
|
0.98730319 0.99618321 0.99364191 0.99872774]
|
|
|
|
mean value: 0.9936444892212708
|
|
|
|
key: test_jcc
|
|
value: [0.91111111 0.84 0.78846154 0.8 0.8125 0.83333333
|
|
0.85416667 0.91111111 0.8125 0.83673469]
|
|
|
|
mean value: 0.8399918454561311
|
|
|
|
key: train_jcc
|
|
value: [0.98734177 0.98982188 0.98982188 0.98737374 0.98227848 0.98488665
|
|
0.97474747 0.99236641 0.98730964 0.99745547]
|
|
|
|
mean value: 0.9873403408684838
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.31891465 0.44751048 0.38999581 0.35558724 0.33960152 0.33973193
|
|
0.40417552 0.29927397 0.2746563 0.30698133]
|
|
|
|
mean value: 0.3476428747177124
|
|
|
|
key: score_time
|
|
value: [0.01942086 0.03408337 0.01951623 0.03345537 0.01954579 0.01944494
|
|
0.034302 0.0343461 0.01945758 0.03698587]
|
|
|
|
mean value: 0.027055811882019044
|
|
|
|
key: test_mcc
|
|
value: [0.58681566 0.39903465 0.54772256 0.50847518 0.68133961 0.61371748
|
|
0.51879367 0.63261064 0.52312769 0.46459728]
|
|
|
|
mean value: 0.547623442308813
|
|
|
|
key: train_mcc
|
|
value: [0.92779891 0.93263014 0.91760331 0.9253913 0.91502105 0.92279174
|
|
0.9198622 0.92253952 0.92789391 0.92012314]
|
|
|
|
mean value: 0.9231655222061333
|
|
|
|
key: test_accuracy
|
|
value: [0.78409091 0.69318182 0.77272727 0.75 0.83908046 0.8045977
|
|
0.75862069 0.81609195 0.75862069 0.72413793]
|
|
|
|
mean value: 0.7701149425287356
|
|
|
|
key: train_accuracy
|
|
value: [0.96310433 0.96564885 0.95801527 0.96183206 0.95679797 0.96060991
|
|
0.95933926 0.96060991 0.96315121 0.95933926]
|
|
|
|
mean value: 0.9608448031142193
|
|
|
|
key: test_fscore
|
|
value: [0.80808081 0.72727273 0.7826087 0.77083333 0.84444444 0.81318681
|
|
0.76404494 0.82222222 0.77894737 0.76 ]
|
|
|
|
mean value: 0.7871641356433801
|
|
|
|
key: train_fscore
|
|
value: [0.96415328 0.96654275 0.9592089 0.96296296 0.95802469 0.96177559
|
|
0.96039604 0.96158612 0.96415328 0.96039604]
|
|
|
|
mean value: 0.9619199642766659
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.65454545 0.75 0.71153846 0.80851064 0.77083333
|
|
0.73913043 0.80434783 0.7254902 0.67857143]
|
|
|
|
mean value: 0.7370240500507275
|
|
|
|
key: train_precision
|
|
value: [0.9375 0.94202899 0.93269231 0.9352518 0.93269231 0.9352518
|
|
0.93719807 0.93719807 0.9375 0.93493976]
|
|
|
|
mean value: 0.9362253092316009
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.81818182 0.81818182 0.84090909 0.88372093 0.86046512
|
|
0.79069767 0.84090909 0.84090909 0.86363636]
|
|
|
|
mean value: 0.8466701902748415
|
|
|
|
key: train_recall
|
|
value: [0.99236641 0.99236641 0.98727735 0.99236641 0.98477157 0.98984772
|
|
0.98477157 0.98727735 0.99236641 0.98727735]
|
|
|
|
mean value: 0.9890688572867826
|
|
|
|
key: test_roc_auc
|
|
value: [0.78409091 0.69318182 0.77272727 0.75 0.83958774 0.80523256
|
|
0.7589852 0.81580338 0.75766385 0.72251586]
|
|
|
|
mean value: 0.7699788583509514
|
|
|
|
key: train_roc_auc
|
|
value: [0.96310433 0.96564885 0.95801527 0.96183206 0.95676238 0.96057271
|
|
0.95930691 0.96064375 0.96318828 0.95937472]
|
|
|
|
mean value: 0.9608449257953269
|
|
|
|
key: test_jcc
|
|
value: [0.6779661 0.57142857 0.64285714 0.62711864 0.73076923 0.68518519
|
|
0.61818182 0.69811321 0.63793103 0.61290323]
|
|
|
|
mean value: 0.6502454162021041
|
|
|
|
key: train_jcc
|
|
value: [0.93078759 0.9352518 0.9216152 0.92857143 0.91943128 0.9263658
|
|
0.92380952 0.92601432 0.93078759 0.92380952]
|
|
|
|
mean value: 0.9266444050803866
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.08145165 1.06318831 1.06463647 1.06263065 1.06868625 1.07411194
|
|
1.07155848 1.07330966 1.06570554 1.06897855]
|
|
|
|
mean value: 1.0694257497787476
|
|
|
|
key: score_time
|
|
value: [0.00957036 0.00953889 0.009655 0.00960064 0.01019549 0.00988364
|
|
0.00950861 0.00972033 0.00976014 0.00954604]
|
|
|
|
mean value: 0.009697914123535156
|
|
|
|
key: test_mcc
|
|
value: [0.93205893 0.86722738 0.81902836 0.80064077 0.83932347 0.79862977
|
|
0.86585804 0.90803383 0.83932347 0.86205074]
|
|
|
|
mean value: 0.8532174747936125
|
|
|
|
key: train_mcc
|
|
value: [0.94910941 0.96438908 0.94912171 0.95165702 0.96190808 0.96190882
|
|
0.96190882 0.9644218 0.95935185 0.96190808]
|
|
|
|
mean value: 0.9585684683799411
|
|
|
|
key: test_accuracy
|
|
value: [0.96590909 0.93181818 0.90909091 0.89772727 0.91954023 0.89655172
|
|
0.93103448 0.95402299 0.91954023 0.93103448]
|
|
|
|
mean value: 0.9256269592476489
|
|
|
|
key: train_accuracy
|
|
value: [0.97455471 0.9821883 0.97455471 0.97582697 0.98094028 0.98094028
|
|
0.98094028 0.98221093 0.97966963 0.98094028]
|
|
|
|
mean value: 0.9792766359189242
|
|
|
|
key: test_fscore
|
|
value: [0.96629213 0.93478261 0.91111111 0.90322581 0.91954023 0.9010989
|
|
0.93333333 0.95454545 0.91954023 0.93181818]
|
|
|
|
mean value: 0.9275287991655823
|
|
|
|
key: train_fscore
|
|
value: [0.97455471 0.98214286 0.9744898 0.97585769 0.98103666 0.98089172
|
|
0.98089172 0.9821883 0.97969543 0.98084291]
|
|
|
|
mean value: 0.9792591788318852
|
|
|
|
key: test_precision
|
|
value: [0.95555556 0.89583333 0.89130435 0.85714286 0.90909091 0.85416667
|
|
0.89361702 0.95454545 0.93023256 0.93181818]
|
|
|
|
mean value: 0.9073306885395176
|
|
|
|
key: train_precision
|
|
value: [0.97455471 0.98465473 0.9769821 0.97461929 0.97732997 0.98465473
|
|
0.98465473 0.9821883 0.97721519 0.98461538]
|
|
|
|
mean value: 0.9801469132744618
|
|
|
|
key: test_recall
|
|
value: [0.97727273 0.97727273 0.93181818 0.95454545 0.93023256 0.95348837
|
|
0.97674419 0.95454545 0.90909091 0.93181818]
|
|
|
|
mean value: 0.9496828752642706
|
|
|
|
key: train_recall
|
|
value: [0.97455471 0.97964377 0.97201018 0.97709924 0.98477157 0.97715736
|
|
0.97715736 0.9821883 0.9821883 0.97709924]
|
|
|
|
mean value: 0.9783870009428967
|
|
|
|
key: test_roc_auc
|
|
value: [0.96590909 0.93181818 0.90909091 0.89772727 0.91966173 0.89719873
|
|
0.93155391 0.95401691 0.91966173 0.93102537]
|
|
|
|
mean value: 0.9257663847780127
|
|
|
|
key: train_roc_auc
|
|
value: [0.97455471 0.9821883 0.97455471 0.97582697 0.98093541 0.98094509
|
|
0.98094509 0.9822109 0.97967283 0.98093541]
|
|
|
|
mean value: 0.9792769403650172
|
|
|
|
key: test_jcc
|
|
value: [0.93478261 0.87755102 0.83673469 0.82352941 0.85106383 0.82
|
|
0.875 0.91304348 0.85106383 0.87234043]
|
|
|
|
mean value: 0.8655109298113325
|
|
|
|
key: train_jcc
|
|
value: [0.95037221 0.96491228 0.95024876 0.9528536 0.96277916 0.9625
|
|
0.9625 0.965 0.960199 0.96240602]
|
|
|
|
mean value: 0.9593771019712535
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03925729 0.0395782 0.04055047 0.04157686 0.04234195 0.0398705
|
|
0.04044795 0.04089832 0.04133368 0.04074621]
|
|
|
|
mean value: 0.04066014289855957
|
|
|
|
key: score_time
|
|
value: [0.01284862 0.01279402 0.0141449 0.01284528 0.01316023 0.01296759
|
|
0.01277375 0.01281953 0.01298618 0.01304054]
|
|
|
|
mean value: 0.013038063049316406
|
|
|
|
key: test_mcc
|
|
value: [-0.10910895 0.09016696 0.20998026 0.03750293 0.21209676 0.15546399
|
|
0.24411022 0.22206651 0.24234093 0.05222823]
|
|
|
|
mean value: 0.13568478555708904
|
|
|
|
key: train_mcc
|
|
value: [0.25503069 0.24932341 0.23759548 0.26064302 0.22239349 0.24365973
|
|
0.23473311 0.23110953 0.25453445 0.24595319]
|
|
|
|
mean value: 0.24349760981536647
|
|
|
|
key: test_accuracy
|
|
value: [0.47727273 0.52272727 0.55681818 0.51136364 0.56321839 0.54022989
|
|
0.55172414 0.55172414 0.57471264 0.51724138]
|
|
|
|
mean value: 0.5367032392894462
|
|
|
|
key: train_accuracy
|
|
value: [0.5610687 0.55852417 0.55343511 0.56361323 0.5476493 0.55654384
|
|
0.55273189 0.5501906 0.56035578 0.55654384]
|
|
|
|
mean value: 0.5560656469150411
|
|
|
|
key: test_fscore
|
|
value: [0.640625 0.66666667 0.688 0.6504065 0.68333333 0.67213115
|
|
0.688 0.69291339 0.69918699 0.66666667]
|
|
|
|
mean value: 0.6747929695969381
|
|
|
|
key: train_fscore
|
|
value: [0.69496021 0.69373345 0.69129288 0.69619132 0.68881119 0.69305189
|
|
0.69122807 0.68947368 0.69434629 0.69251101]
|
|
|
|
mean value: 0.6925599996064771
|
|
|
|
key: test_precision
|
|
value: [0.48809524 0.51219512 0.5308642 0.50632911 0.53246753 0.51898734
|
|
0.52439024 0.53012048 0.5443038 0.51219512]
|
|
|
|
mean value: 0.5199948190990781
|
|
|
|
key: train_precision
|
|
value: [0.53252033 0.53108108 0.52822581 0.53396739 0.52533333 0.53028264
|
|
0.52815013 0.52610442 0.53179973 0.5296496 ]
|
|
|
|
mean value: 0.5297114452098144
|
|
|
|
key: test_recall
|
|
value: [0.93181818 0.95454545 0.97727273 0.90909091 0.95348837 0.95348837
|
|
1. 1. 0.97727273 0.95454545]
|
|
|
|
mean value: 0.9611522198731501
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.47727273 0.52272727 0.55681818 0.51136364 0.56765328 0.544926
|
|
0.55681818 0.54651163 0.57003171 0.51215645]
|
|
|
|
mean value: 0.5366279069767442
|
|
|
|
key: train_roc_auc
|
|
value: [0.5610687 0.55852417 0.55343511 0.56361323 0.54707379 0.55597964
|
|
0.55216285 0.55076142 0.56091371 0.5571066 ]
|
|
|
|
mean value: 0.5560639232249648
|
|
|
|
key: test_jcc
|
|
value: [0.47126437 0.5 0.52439024 0.48192771 0.51898734 0.50617284
|
|
0.52439024 0.53012048 0.5375 0.5 ]
|
|
|
|
mean value: 0.5094753229670379
|
|
|
|
key: train_jcc
|
|
value: [0.53252033 0.53108108 0.52822581 0.53396739 0.52533333 0.53028264
|
|
0.52815013 0.52610442 0.53179973 0.5296496 ]
|
|
|
|
mean value: 0.5297114452098144
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01993704 0.0179441 0.01770711 0.03821087 0.04329848 0.01797175
|
|
0.0178349 0.03925276 0.04720855 0.04208231]
|
|
|
|
mean value: 0.030144786834716795
|
|
|
|
key: score_time
|
|
value: [0.01368213 0.01220298 0.01225615 0.01899743 0.01320052 0.01226377
|
|
0.02703691 0.01902175 0.01903915 0.01890492]
|
|
|
|
mean value: 0.016660571098327637
|
|
|
|
key: test_mcc
|
|
value: [0.6882472 0.42521003 0.54601891 0.5933661 0.67900591 0.54295079
|
|
0.69052856 0.70984404 0.58821234 0.50171077]
|
|
|
|
mean value: 0.5965094644310827
|
|
|
|
key: train_mcc
|
|
value: [0.72661129 0.72075868 0.7378189 0.74120574 0.71121629 0.71197478
|
|
0.72691923 0.74429699 0.72825208 0.74359616]
|
|
|
|
mean value: 0.72926501243974
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.70454545 0.77272727 0.79545455 0.83908046 0.77011494
|
|
0.83908046 0.85057471 0.79310345 0.74712644]
|
|
|
|
mean value: 0.7952716823406478
|
|
|
|
key: train_accuracy
|
|
value: [0.86132316 0.85877863 0.86768448 0.86895674 0.85387548 0.85387548
|
|
0.86149936 0.8703939 0.86149936 0.8703939 ]
|
|
|
|
mean value: 0.8628280486661429
|
|
|
|
key: test_fscore
|
|
value: [0.85106383 0.74 0.77777778 0.80434783 0.84090909 0.77777778
|
|
0.85106383 0.86315789 0.80434783 0.77083333]
|
|
|
|
mean value: 0.8081279186283203
|
|
|
|
key: train_fscore
|
|
value: [0.86819831 0.86512758 0.87286064 0.87484812 0.86094317 0.86161252
|
|
0.86851628 0.87621359 0.86914766 0.87560976]
|
|
|
|
mean value: 0.8693077616688508
|
|
|
|
key: test_precision
|
|
value: [0.8 0.66071429 0.76086957 0.77083333 0.82222222 0.74468085
|
|
0.78431373 0.80392157 0.77083333 0.71153846]
|
|
|
|
mean value: 0.7629927346540505
|
|
|
|
key: train_precision
|
|
value: [0.82718894 0.82790698 0.84 0.8372093 0.8221709 0.81922197
|
|
0.82758621 0.83758701 0.82272727 0.84074941]
|
|
|
|
mean value: 0.8302347988922448
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.84090909 0.79545455 0.84090909 0.86046512 0.81395349
|
|
0.93023256 0.93181818 0.84090909 0.84090909]
|
|
|
|
mean value: 0.8604651162790697
|
|
|
|
key: train_recall
|
|
value: [0.91348601 0.90585242 0.90839695 0.91603053 0.9035533 0.90862944
|
|
0.91370558 0.91857506 0.92111959 0.91348601]
|
|
|
|
mean value: 0.9122834889758593
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.70454545 0.77272727 0.79545455 0.83932347 0.77061311
|
|
0.84011628 0.84963002 0.79254757 0.74603594]
|
|
|
|
mean value: 0.7951902748414376
|
|
|
|
key: train_roc_auc
|
|
value: [0.86132316 0.85877863 0.86768448 0.86895674 0.85381227 0.85380581
|
|
0.86143294 0.87045504 0.86157502 0.87044859]
|
|
|
|
mean value: 0.8628272690871985
|
|
|
|
key: test_jcc
|
|
value: [0.74074074 0.58730159 0.63636364 0.67272727 0.7254902 0.63636364
|
|
0.74074074 0.75925926 0.67272727 0.62711864]
|
|
|
|
mean value: 0.6798832986370374
|
|
|
|
key: train_jcc
|
|
value: [0.76709402 0.76231263 0.77440347 0.7775378 0.75583864 0.75687104
|
|
0.76759062 0.77969762 0.76857749 0.77874187]
|
|
|
|
mean value: 0.7688665198477691
|
|
|
|
MCC on Blind test: 0.44
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.32210398 0.26895761 0.34120345 0.32642722 0.34458899 0.4479835
|
|
0.36319351 0.43385816 0.46567249 0.45032525]
|
|
|
|
mean value: 0.37643141746520997
|
|
|
|
key: score_time
|
|
value: [0.01219916 0.01898623 0.01927805 0.01902223 0.02566624 0.01927495
|
|
0.02244329 0.02526784 0.02273488 0.02529621]
|
|
|
|
mean value: 0.021016907691955567
|
|
|
|
key: test_mcc
|
|
value: [0.6882472 0.42521003 0.54601891 0.59648091 0.65994555 0.54295079
|
|
0.69052856 0.70540345 0.5641598 0.50171077]
|
|
|
|
mean value: 0.592065596277015
|
|
|
|
key: train_mcc
|
|
value: [0.72661129 0.72075868 0.7378189 0.75824295 0.74898219 0.71197478
|
|
0.72691923 0.75450866 0.75617256 0.74359616]
|
|
|
|
mean value: 0.7385585392437356
|
|
|
|
key: test_accuracy
|
|
value: [0.84090909 0.70454545 0.77272727 0.79545455 0.82758621 0.77011494
|
|
0.83908046 0.85057471 0.7816092 0.74712644]
|
|
|
|
mean value: 0.7929728317659352
|
|
|
|
key: train_accuracy
|
|
value: [0.86132316 0.85877863 0.86768448 0.8778626 0.8729352 0.85387548
|
|
0.86149936 0.87547649 0.87547649 0.8703939 ]
|
|
|
|
mean value: 0.8675305779993598
|
|
|
|
key: test_fscore
|
|
value: [0.85106383 0.74 0.77777778 0.80851064 0.83516484 0.77777778
|
|
0.85106383 0.86021505 0.79120879 0.77083333]
|
|
|
|
mean value: 0.8063615866898297
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:196: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_cd_sl.py:199: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.86819831 0.86512758 0.87286064 0.88264059 0.87864078 0.86161252
|
|
0.86851628 0.88106796 0.88221154 0.87560976]
|
|
|
|
mean value: 0.8736485943790752
|
|
|
|
key: test_precision
|
|
value: [0.8 0.66071429 0.76086957 0.76 0.79166667 0.74468085
|
|
0.78431373 0.81632653 0.76595745 0.71153846]
|
|
|
|
mean value: 0.7596067533111587
|
|
|
|
key: train_precision
|
|
value: [0.82718894 0.82790698 0.84 0.84941176 0.84186047 0.81922197
|
|
0.82758621 0.84222738 0.83599089 0.84074941]
|
|
|
|
mean value: 0.8352144002611301
|
|
|
|
key: test_recall
|
|
value: [0.90909091 0.84090909 0.79545455 0.86363636 0.88372093 0.81395349
|
|
0.93023256 0.90909091 0.81818182 0.84090909]
|
|
|
|
mean value: 0.8605179704016913
|
|
|
|
key: train_recall
|
|
value: [0.91348601 0.90585242 0.90839695 0.91857506 0.91878173 0.90862944
|
|
0.91370558 0.92366412 0.93384224 0.91348601]
|
|
|
|
mean value: 0.9158419550251223
|
|
|
|
key: test_roc_auc
|
|
value: [0.84090909 0.70454545 0.77272727 0.79545455 0.8282241 0.77061311
|
|
0.84011628 0.84989429 0.78118393 0.74603594]
|
|
|
|
mean value: 0.7929704016913319
|
|
|
|
key: train_roc_auc
|
|
value: [0.86132316 0.85877863 0.86768448 0.8778626 0.87287687 0.85380581
|
|
0.86143294 0.87553764 0.87555056 0.87044859]
|
|
|
|
mean value: 0.867530127484791
|
|
|
|
key: test_jcc
|
|
value: [0.74074074 0.58730159 0.63636364 0.67857143 0.71698113 0.63636364
|
|
0.74074074 0.75471698 0.65454545 0.62711864]
|
|
|
|
mean value: 0.6773443981902568
|
|
|
|
key: train_jcc
|
|
value: [0.76709402 0.76231263 0.77440347 0.78993435 0.78354978 0.75687104
|
|
0.76759062 0.78741866 0.78924731 0.77874187]
|
|
|
|
mean value: 0.7757163746391412
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.74
|