/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_8020.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
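The SettingWithCopyWarning above is typically raised when an in-place operation runs on a DataFrame that was itself produced by slicing another one. A minimal sketch of one usual remedy follows, assuming `mask_check` is derived from a larger frame by a boolean mask. The frame `df`, its values, and the threshold of 10 are hypothetical and are not taken from ml_data_8020.py.

```python
import pandas as pd

# Hypothetical stand-in for the source DataFrame (not the real data).
df = pd.DataFrame(
    {
        "mutationinformation": ["A1B", "C2D", "E3F", "G4H"],
        "ligand_distance": [12.5, 3.2, 8.9, 1.1],
    }
)

# Take an explicit copy of the slice so later changes never touch a view of df.
mask_check = df.loc[df["ligand_distance"] < 10].copy()

# Reassign the sorted result instead of sorting in place.
mask_check = mask_check.sort_values(by=["ligand_distance"], ascending=True)

print(mask_check)
```

Reassigning the result of `sort_values` (rather than passing `inplace = True`) keeps the operation on a frame that is unambiguously its own object.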
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
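The ConvergenceWarning names two remedies: scale the data, or raise `max_iter`. The sketch below applies both on synthetic data. It is an illustration under assumptions, not the configuration used in this run: the data set, the choice of `StandardScaler`, and the value `max_iter=1000` are all placeholders.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the real feature matrix (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=42)

# Scale the features and raise max_iter above the scikit-learn default of 100.
clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=1000, random_state=42),
)

with warnings.catch_warnings():
    # Turn any remaining ConvergenceWarning into an error so it cannot go unnoticed.
    warnings.simplefilter("error", category=ConvergenceWarning)
    clf.fit(X, y)

# make_pipeline names each step after its lowercased class name.
n_iter = int(clf.named_steps["logisticregression"].n_iter_[0])
print(f"lbfgs iterations used: {n_iter}")
```

Comparing `n_iter_` with `max_iter` after fitting is a quick way to see how close the solver came to the limit.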
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 858

PASS: my_features_df and aa_df successfully combined
nrows: 858
ncols: 269
count of NULL values before imputation

or_mychisq          244
log10_or_mychisq    244
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 168
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 175

-------------------------------------------------------------
Successfully split data with stratification: 80/20
Train data size: (358, 175)
Test data size: (90, 175)
y_train numbers: Counter({0: 282, 1: 76})
y_train ratio: 3.710526315789474

y_test_numbers: Counter({0: 71, 1: 19})
y_test ratio: 3.736842105263158
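The ratios logged above can be reproduced from the class counts. The short sketch below assumes that "ratio" means the count of class 0 divided by the count of class 1, which is an inference from the numbers and not something the log states. The helper name `class_ratio` is hypothetical.

```python
from collections import Counter

# Class counts copied from the log lines above.
y_train_counts = Counter({0: 282, 1: 76})
y_test_counts = Counter({0: 71, 1: 19})


def class_ratio(counts):
    """Return the class-0 to class-1 ratio (assumed meaning of 'ratio')."""
    return counts[0] / counts[1]


train_ratio = class_ratio(y_train_counts)
test_ratio = class_ratio(y_test_counts)

# Fraction of all rows that ended up in the training split.
n_train = sum(y_train_counts.values())
n_test = sum(y_test_counts.values())
train_fraction = n_train / (n_train + n_test)

print(train_ratio, test_ratio, round(train_fraction, 3))
```

Near-identical ratios in both splits are what a stratified split is expected to produce.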
-------------------------------------------------------------

Simple Random OverSampling
Counter({0: 282, 1: 282})
(564, 175)

Simple Random UnderSampling
Counter({0: 76, 1: 76})
(152, 175)

Simple Combined Over and UnderSampling
Counter({0: 282, 1: 282})
(564, 175)

SMOTE_NC OverSampling
Counter({0: 282, 1: 282})
(564, 175)
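The over- and undersampling counts above follow directly from the training class counts (282 vs 76). The log does not show which library produced them, so the plain-Python sketch below only illustrates the idea of simple random resampling. The function names and the seed are hypothetical, and SMOTE-NC (which synthesises new samples) is not covered.

```python
import random
from collections import Counter


def _indices_by_class(labels):
    """Group row indices by their class label."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    return by_class


def random_oversample(labels, seed=42):
    """Resample every class with replacement up to the majority class size."""
    rng = random.Random(seed)
    by_class = _indices_by_class(labels)
    target = max(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        out.extend(idx)                                    # keep all originals
        out.extend(rng.choices(idx, k=target - len(idx)))  # add duplicates
    return out


def random_undersample(labels, seed=42):
    """Sample every class without replacement down to the minority class size."""
    rng = random.Random(seed)
    by_class = _indices_by_class(labels)
    target = min(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        out.extend(rng.sample(idx, target))
    return out


# Labels with the same class counts as y_train in the log.
y_train = [0] * 282 + [1] * 76
over = random_oversample(y_train)
under = random_undersample(y_train)

over_counts = Counter(y_train[i] for i in over)
under_counts = Counter(y_train[i] for i in under)
print(over_counts, len(over))
print(under_counts, len(under))
```

The returned indices would be used to select the matching rows of the feature matrix, which is why the row counts become 564 and 152 while the column count stays at 175.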

#####################################################################

Running ML analysis: 80/20 split
Gene name: embB
Drug name: ethambutol

Output directory: /home/tanu/git/Data/ethambutol/output/ml/tts_8020/
Sanity checks:
ML source data size: (448, 175)
Total input features: (358, 175)
Target feature numbers: Counter({0: 282, 1: 76})
Target features ratio: 3.710526315789474

#####################################################################


================================================================

Structural features (n): 36
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################
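The "No. of features match" check can be reproduced from the group sizes printed above. This is a sketch of such a sanity check, not the script's own code. The dictionary names are hypothetical, and the per-group numbers are copied from the log.

```python
# Feature-group sizes as reported in the log.
feature_groups = {
    "structural": 36,
    "aaindex": 123,
    "evolutionary": 3,
    "genomic": 6,
    "categorical": 7,
}

# Breakdown of the structural group, counted from the three lists printed above.
structural_parts = {"common_stability": 10, "foldx": 23, "other_struc": 3}

expected_ncols = 175  # "No. of columns for x_features: 175"

total_features = sum(feature_groups.values())
n_numerical = total_features - feature_groups["categorical"]

if total_features == expected_ncols:
    print("Pass: No. of features match")
else:
    print(f"FAIL: expected {expected_ncols}, got {total_features}")
```

The same arithmetic also recovers the 168 numerical features reported earlier (175 minus the 7 categorical ones).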


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
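The pipeline printed above scales numerical columns with `MinMaxScaler`, one-hot encodes categorical columns, and passes the result to the model. The sketch below rebuilds that shape on a small synthetic frame and scores it with `cross_validate`, which yields the same kind of keys (`fit_time`, `test_mcc`, and so on) that this log reports. The data, the three column values, `max_iter=1000`, `handle_unknown="ignore"`, and the 10-fold `StratifiedKFold` are assumptions, since the log does not show how the cross-validation was configured.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in for the real feature table (assumption).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame(
    {
        "ligand_distance": rng.uniform(0, 30, n),
        "ddg_foldx": rng.normal(0, 1, n),
        "ss_class": rng.choice(["helix", "sheet", "loop"], n),
    }
)
y = (X["ligand_distance"] < 10).astype(int)  # toy target, not the real labels

num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class"]

prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
    ],
)
pipe = Pipeline(
    steps=[
        ("prep", prep),
        ("model", LogisticRegression(max_iter=1000, random_state=42)),
    ]
)

scoring = {"mcc": make_scorer(matthews_corrcoef), "accuracy": "accuracy"}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Keeping the scaler and encoder inside the pipeline means they are refitted on each training fold, so no information from the held-out fold leaks into preprocessing.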

key: fit_time
value: [0.0428791  0.04306722 0.03533816 0.03511214 0.036165   0.04105043
 0.03672242 0.0365274  0.03594804 0.03573561]

mean value: 0.037854552268981934

key: score_time
value: [0.0125525  0.01230192 0.0132637  0.01333547 0.01366806 0.01352763
 0.0132668  0.01345658 0.01344943 0.01346922]

mean value: 0.013229131698608398

key: test_mcc
value: [0.8174367  0.49365725 0.44883281 0.75134288 0.51785714 0.45374261
 0.51785714 0.67857143 0.71842121 0.72019314]

mean value: 0.6117912318227132
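For reference, `test_mcc` is the Matthews correlation coefficient per fold, and the "mean value" line is the average over the ten folds. The sketch below shows the standard MCC formula from confusion-matrix counts and recomputes the mean from the rounded fold values printed above. The function name `mcc` is hypothetical, and returning 0.0 for a zero denominator is a common convention assumed here.

```python
import math
from statistics import fmean


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Per-fold test_mcc values as printed in the log (rounded to 8 decimals).
test_mcc = [
    0.8174367, 0.49365725, 0.44883281, 0.75134288, 0.51785714,
    0.45374261, 0.51785714, 0.67857143, 0.71842121, 0.72019314,
]

mean_test_mcc = fmean(test_mcc)
print(mean_test_mcc)
```

Because the fold values in the log are rounded, the recomputed mean agrees with the logged mean only to about eight decimal places.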

key: train_mcc
value: [0.79905267 0.83859776 0.82668723 0.78613568 0.81652347 0.8365424
 0.79631634 0.84662994 0.82882139 0.80889737]

mean value: 0.818420423682095

key: test_accuracy
value: [0.94444444 0.86111111 0.83333333 0.91666667 0.83333333 0.83333333
 0.83333333 0.88888889 0.91428571 0.91428571]

mean value: 0.8773015873015872

key: train_accuracy
value: [0.93478261 0.94720497 0.94409938 0.93167702 0.94099379 0.94720497
 0.93478261 0.95031056 0.94427245 0.9380805 ]

mean value:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
0.9413408841797588

key: test_fscore
value: [0.83333333 0.44444444 0.5 0.76923077 0.625 0.4
0.625 0.75 0.72727273 0.76923077]

mean value: 0.6443512043512043

key: train_fscore
value: [0.83464567 0.86821705 0.85714286 0.81666667 0.84552846 0.864
0.82644628 0.87096774 0.859375 0.84126984]

mean value: 0.8484259566846042

key: test_precision
value: [1. 1. 0.75 1. 0.625 1.
0.625 0.75 1. 0.83333333]

mean value: 0.8583333333333334

key: train_precision
value: [0.9137931 0.93333333 0.93103448 0.94230769 0.94545455 0.94736842
0.94339623 0.96428571 0.93220339 0.92982456]

mean value: 0.9383001470289924

key: test_recall
value: [0.71428571 0.28571429 0.375 0.625 0.625 0.25
0.625 0.75 0.57142857 0.71428571]

mean value: 0.5535714285714286

key: train_recall
value: [0.76811594 0.8115942 0.79411765 0.72058824 0.76470588 0.79411765
0.73529412 0.79411765 0.79710145 0.76811594]

mean value: 0.7747868712702473

key: test_roc_auc
value: [0.85714286 0.64285714 0.66964286 0.8125 0.75892857 0.625
0.75892857 0.83928571 0.78571429 0.83928571]

mean value: 0.7589285714285714

key: train_roc_auc
value: [0.87417655 0.89789196 0.88918481 0.85438861 0.87644743 0.89115331
0.86174155 0.89312182 0.89067671 0.87618396]

mean value: 0.8804966692724209

key: test_jcc
value: [0.71428571 0.28571429 0.33333333 0.625 0.45454545 0.25
0.45454545 0.6 0.57142857 0.625 ]

mean value: 0.49138528138528137

key: train_jcc
value: [0.71621622 0.76712329 0.75 0.69014085 0.73239437 0.76056338
0.70422535 0.77142857 0.75342466 0.7260274 ]

mean value: 0.7371544073772512

MCC on Blind test: 0.76

Accuracy on Blind test: 0.92

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.90058208 0.80069208 0.91843367 0.97248459 0.82589054 0.95298767
0.91078734 0.78506351 1.01478314 0.77273583]

mean value: 0.8854440450668335

key: score_time
value: [0.0133543 0.01355958 0.0135107 0.01562595 0.01443386 0.01595211
0.01460838 0.01361895 0.01384115 0.01354361]

mean value: 0.014204859733581543

key: test_mcc
value: [0.75032247 0.61369649 0.65737574 0.75134288 0.41267736 0.45374261
0.67857143 0.77151675 0.61237244 0.61237244]

mean value: 0.6313990594842422

key: train_mcc
value: [0.85829157 0.96310935 0.98135711 0.96271422 0.97192696 0.89565519
0.96271422 1. 0.89735962 0.96314048]

mean value: 0.9456268716050406

key: test_accuracy
value: [0.91666667 0.88888889 0.88888889 0.91666667 0.80555556 0.83333333
0.88888889 0.91666667 0.88571429 0.88571429]

mean value: 0.8826984126984126

key: train_accuracy
value: [0.95341615 0.98757764 0.99378882 0.98757764 0.99068323 0.96583851
0.98757764 1. 0.96594427 0.9876161 ]

mean value: 0.982001999884622

key: test_fscore
value: [0.8 0.6 0.71428571 0.76923077 0.53333333 0.4
0.75 0.82352941 0.6 0.66666667]

mean value: 0.665704589528119

key: train_fscore
value: [0.88549618 0.97101449 0.98529412 0.97058824 0.97777778 0.91603053
0.97058824 1. 0.91851852 0.97101449]

mean value: 0.9566322587596089

key: test_precision
value: [0.75 1. 0.83333333 1. 0.57142857 1.
0.75 0.77777778 1. 0.8 ]

mean value: 0.8482539682539683

key: train_precision
value: [0.93548387 0.97101449 0.98529412 0.97058824 0.98507463 0.95238095
0.97058824 1. 0.93939394 0.97101449]

mean value: 0.9680832963350846

key: test_recall
value: [0.85714286 0.42857143 0.625 0.625 0.5 0.25
0.75 0.875 0.42857143 0.57142857]

mean value: 0.5910714285714286

key: train_recall
value: [0.84057971 0.97101449 0.98529412 0.97058824 0.97058824 0.88235294
0.97058824 1. 0.89855072 0.97101449]

mean value: 0.9460571184995737

key: test_roc_auc
value: [0.89408867 0.71428571 0.79464286 0.8125 0.69642857 0.625
0.83928571 0.90178571 0.71428571 0.76785714]

mean value: 0.7760160098522167

key: train_roc_auc
value: [0.91238472 0.98155468 0.99067855 0.98135711 0.98332561 0.93527096
0.98135711 1. 0.94140135 0.98157024]

mean value: 0.968890032593287

key: test_jcc
value: [0.66666667 0.42857143 0.55555556 0.625 0.36363636 0.25
0.6 0.7 0.42857143 0.5 ]

mean value: 0.5118001443001443

key: train_jcc
value: [0.79452055 0.94366197 0.97101449 0.94285714 0.95652174 0.84507042
0.94285714 1. 0.84931507 0.94366197]

mean value: 0.9189480500233883

MCC on Blind test: 0.8

Accuracy on Blind test: 0.93

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.0145061 0.01411271 0.00958824 0.00958681 0.01035023 0.00946975
0.00950241 0.01032686 0.01084828 0.00958943]

mean value: 0.01078808307647705

key: score_time
value: [0.01501393 0.01024151 0.00904894 0.00896454 0.00931573 0.00904608
0.00889969 0.00926065 0.0098412 0.00942135]

mean value: 0.009905362129211425

key: test_mcc
value: [0.40804713 0.34527065 0.5157267 0.6172134 0.3086067 0.2438548
0.29366622 0.40089186 0.40147753 0.68640647]

mean value: 0.4221161472620327

key: train_mcc
value: [0.62471066 0.5915192 0.67760901 0.65520113 0.64418833 0.66117244
0.66633852 0.52011895 0.60022186 0.64725803]

mean value: 0.62883381442086

key: test_accuracy
value: [0.69444444 0.80555556 0.80555556 0.86111111 0.75 0.75
0.69444444 0.66666667 0.68571429 0.88571429]

mean value: 0.7599206349206349

key: train_accuracy
value: [0.85093168 0.83850932 0.87267081 0.86335404 0.8757764 0.86956522
0.86645963 0.72981366 0.79566563 0.85758514]

mean value: 0.8420331519335423

key: test_fscore
value: [0.52173913 0.46153846 0.63157895 0.70588235 0.47058824 0.4
0.47619048 0.53846154 0.52173913 0.75 ]

mean value: 0.5477718272663756

key: train_fscore
value: [0.70731707 0.68292683 0.74534161 0.72839506 0.72222222 0.73417722
0.73619632 0.60273973 0.67 0.72289157]

mean value: 0.705220762779721

key: test_precision
value: [0.375 0.5 0.54545455 0.66666667 0.44444444 0.42857143
0.38461538 0.38888889 0.375 0.66666667]

mean value: 0.4775308025308025

key: train_precision
value: [0.61052632 0.58947368 0.64516129 0.62765957 0.68421053 0.64444444
0.63157895 0.43708609 0.51145038 0.6185567 ]

mean value: 0.6000147958344869

key: test_recall
value: [0.85714286 0.42857143 0.75 0.75 0.5 0.375
0.625 0.875 0.85714286 0.85714286]

mean value: 0.6875

key: train_recall
value: [0.84057971 0.8115942 0.88235294 0.86764706 0.76470588 0.85294118
0.88235294 0.97058824 0.97101449 0.86956522]

mean value: 0.8713341858482523

key: test_roc_auc
value: [0.75615764 0.66256158 0.78571429 0.82142857 0.66071429 0.61607143
0.66964286 0.74107143 0.75 0.875 ]

mean value: 0.7338362068965517

key: train_roc_auc
value: [0.84716733 0.828722 0.87621584 0.86492589 0.83510885 0.86347846
0.87227883 0.81797128 0.85952299 0.86194796]

mean value: 0.8527339442515047

key: test_jcc
value: [0.35294118 0.3 0.46153846 0.54545455 0.30769231 0.25
0.3125 0.36842105 0.35294118 0.6 ]

mean value: 0.385148872025807

key: train_jcc
value: [0.54716981 0.51851852 0.59405941 0.57281553 0.56521739 0.58
0.58252427 0.43137255 0.5037594 0.56603774]

mean value: 0.5461474616274362

MCC on Blind test: 0.53

Accuracy on Blind test: 0.82

Model_name: Naive Bayes
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01099086 0.01003337 0.010741 0.01009178 0.01001143 0.01029563
|
|
0.01055908 0.01037931 0.00990558 0.00981903]
|
|
|
|
mean value: 0.01028270721435547
|
|
|
|
key: score_time
|
|
value: [0.00966334 0.00921583 0.00926971 0.009969 0.00917459 0.00983572
|
|
0.00905895 0.0090332 0.00921869 0.00948954]
|
|
|
|
mean value: 0.009392857551574707
|
|
|
|
key: test_mcc
|
|
value: [ 0.75032247 0.2085873 0.16205093 0.47809144 0.58149992 -0.18898224
|
|
0.35714286 0.0805823 0.49391458 0.10206207]
|
|
|
|
mean value: 0.30252716321584283
|
|
|
|
key: train_mcc
|
|
value: [0.39769343 0.45897008 0.43461577 0.4144431 0.49728141 0.50633817
|
|
0.48412839 0.39761525 0.41442016 0.42938548]
|
|
|
|
mean value: 0.4434891238000965
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.77777778 0.77777778 0.83333333 0.86111111 0.66666667
|
|
0.77777778 0.75 0.85714286 0.77142857]
|
|
|
|
mean value: 0.798968253968254
|
|
|
|
key: train_accuracy
|
|
value: [0.81677019 0.83850932 0.82608696 0.82608696 0.84782609 0.85093168
|
|
0.84161491 0.81987578 0.82352941 0.82352941]
|
|
|
|
mean value: 0.8314760686883449
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.33333333 0.2 0.57142857 0.66666667 0.
|
|
0.5 0.18181818 0.54545455 0.2 ]
|
|
|
|
mean value: 0.3998701298701299
|
|
|
|
key: train_fscore
|
|
value: [0.4957265 0.52727273 0.53333333 0.5 0.57391304 0.57894737
|
|
0.57142857 0.49122807 0.50434783 0.52892562]
|
|
|
|
mean value: 0.5305123055757547
|
|
|
|
key: test_precision
|
|
value: [0.75 0.4 0.5 0.66666667 0.71428571 0.
|
|
0.5 0.33333333 0.75 0.33333333]
|
|
|
|
mean value: 0.49476190476190474
|
|
|
|
key: train_precision
|
|
value: [0.60416667 0.70731707 0.61538462 0.63636364 0.70212766 0.7173913
|
|
0.66666667 0.60869565 0.63043478 0.61538462]
|
|
|
|
mean value: 0.6503932672341834
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.28571429 0.125 0.5 0.625 0.
|
|
0.5 0.125 0.42857143 0.14285714]
|
|
|
|
mean value: 0.35892857142857143
|
|
|
|
key: train_recall
|
|
value: [0.42028986 0.42028986 0.47058824 0.41176471 0.48529412 0.48529412
|
|
0.5 0.41176471 0.42028986 0.46376812]
|
|
|
|
mean value: 0.44893435635123613
|
|
|
|
key: test_roc_auc
|
|
value: [0.89408867 0.591133 0.54464286 0.71428571 0.77678571 0.42857143
|
|
0.67857143 0.52678571 0.69642857 0.53571429]
|
|
|
|
mean value: 0.6387007389162562
|
|
|
|
key: train_roc_auc
|
|
value: [0.67259552 0.68642951 0.69592404 0.67438629 0.715088 0.71705651
|
|
0.71653543 0.67044928 0.67668036 0.69251398]
|
|
|
|
mean value: 0.6917658928125731
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.2 0.11111111 0.4 0.5 0.
|
|
0.33333333 0.1 0.375 0.11111111]
|
|
|
|
mean value: 0.2797222222222222
|
|
|
|
key: train_jcc
|
|
value: [0.32954545 0.35802469 0.36363636 0.33333333 0.40243902 0.40740741
|
|
0.4 0.3255814 0.3372093 0.35955056]
|
|
|
|
mean value: 0.3616727534142999
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00918794 0.0107317 0.01008606 0.0098629 0.00994635 0.00929689
|
|
0.01015115 0.01002431 0.01017118 0.01010108]
|
|
|
|
mean value: 0.009955954551696778
|
|
|
|
key: score_time
|
|
value: [0.08292389 0.01307893 0.01189065 0.01137209 0.01231623 0.0125916
|
|
0.01275134 0.01540232 0.01305771 0.01362872]
|
|
|
|
mean value: 0.019901347160339356
|
|
|
|
key: test_mcc
|
|
value: [-0.08304548 0.34404556 -0.12964074 0.45374261 0.45374261 -0.09035079
|
|
0.32232919 0.31622777 -0.08574929 -0.08574929]
|
|
|
|
mean value: 0.1415552123996879
|
|
|
|
key: train_mcc
|
|
value: [0.49666776 0.39869846 0.50648694 0.43431192 0.43289908 0.46108514
|
|
0.37190677 0.38709528 0.47029901 0.47029901]
|
|
|
|
mean value: 0.4429749369850348
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.83333333 0.72222222 0.83333333 0.83333333 0.75
|
|
0.80555556 0.80555556 0.77142857 0.77142857]
|
|
|
|
mean value: 0.7903968253968254
|
|
|
|
key: train_accuracy
|
|
value: [0.85093168 0.82919255 0.85403727 0.83850932 0.83850932 0.8447205
|
|
0.82608696 0.82919255 0.84520124 0.84520124]
|
|
|
|
mean value: 0.8401582601003789
|
|
|
|
key: test_fscore
|
|
value: [0. 0.25 0. 0.4 0.4 0.
|
|
0.36363636 0.22222222 0. 0. ]
|
|
|
|
mean value: 0.16358585858585858
|
|
|
|
key: train_fscore
|
|
value: [0.51020408 0.38202247 0.48351648 0.40909091 0.42222222 0.46808511
0.33333333 0.36781609 0.47916667 0.47916667]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
|
|
|
|
mean value: 0.4334624033376049
|
|
|
|
key: test_precision
|
|
value: [0. 1. 0. 1. 1. 0.
|
|
0.66666667 1. 0. 0. ]
|
|
|
|
mean value: 0.4666666666666667
|
|
|
|
key: train_precision
|
|
value: [0.86206897 0.85 0.95652174 0.9 0.86363636 0.84615385
|
|
0.875 0.84210526 0.85185185 0.85185185]
|
|
|
|
mean value: 0.8699189881299484
|
|
|
|
key: test_recall
|
|
value: [0. 0.14285714 0. 0.25 0.25 0.
|
|
0.25 0.125 0. 0. ]
|
|
|
|
mean value: 0.10178571428571428
|
|
|
|
key: train_recall
|
|
value: [0.36231884 0.24637681 0.32352941 0.26470588 0.27941176 0.32352941
|
|
0.20588235 0.23529412 0.33333333 0.33333333]
|
|
|
|
mean value: 0.290771526001705
|
|
|
|
key: test_roc_auc
|
|
value: [0.48275862 0.57142857 0.46428571 0.625 0.625 0.48214286
|
|
0.60714286 0.5625 0.48214286 0.48214286]
|
|
|
|
mean value: 0.538454433497537
|
|
|
|
key: train_roc_auc
|
|
value: [0.67325428 0.61725955 0.6597962 0.62841593 0.63380037 0.65389069
|
|
0.59900417 0.61174155 0.65879265 0.65879265]
|
|
|
|
mean value: 0.6394748047362482
|
|
|
|
key: test_jcc
|
|
value: [0. 0.14285714 0. 0.25 0.25 0.
|
|
0.22222222 0.125 0. 0. ]
|
|
|
|
mean value: 0.0990079365079365
|
|
|
|
key: train_jcc
|
|
value: [0.34246575 0.23611111 0.31884058 0.25714286 0.26760563 0.30555556
|
|
0.2 0.22535211 0.31506849 0.31506849]
|
|
|
|
mean value: 0.2783210589724569
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01502514 0.01375127 0.01382565 0.01428938 0.01496148 0.01451397
|
|
0.01644087 0.01448464 0.01387644 0.01542759]
|
|
|
|
mean value: 0.014659643173217773
|
|
|
|
key: score_time
|
|
value: [0.01057029 0.01050878 0.01076698 0.01031327 0.01009369 0.01014829
|
|
0.01023912 0.0101285 0.01013803 0.01021695]
|
|
|
|
mean value: 0.010312390327453614
|
|
|
|
key: test_mcc
|
|
value: [0.49365725 0. 0.16205093 0.31622777 0.47809144 0.
|
|
0.2362278 0.44883281 0. 0.49236596]
|
|
|
|
mean value: 0.26274539603801333
|
|
|
|
key: train_mcc
|
|
value: [0.59932645 0.65172653 0.65308612 0.65308612 0.63080736 0.67251176
|
|
0.65979056 0.65979056 0.63347284 0.6251464 ]
|
|
|
|
mean value: 0.6438744675217546
|
|
|
|
key: test_accuracy
|
|
value: [0.86111111 0.80555556 0.77777778 0.80555556 0.83333333 0.77777778
|
|
0.77777778 0.83333333 0.8 0.85714286]
|
|
|
|
mean value: 0.812936507936508
|
|
|
|
key: train_accuracy
|
|
value: [0.8757764 0.89130435 0.89130435 0.89130435 0.88509317 0.89751553
|
|
0.89440994 0.89440994 0.88544892 0.88235294]
|
|
|
|
mean value: 0.8888919870007499
|
|
|
|
key: test_fscore
|
|
value: [0.44444444 0. 0.2 0.22222222 0.57142857 0.
|
|
0.33333333 0.5 0. 0.44444444]
|
|
|
|
mean value: 0.2715873015873016
|
|
|
|
key: train_fscore
|
|
value: [0.6 0.68468468 0.65346535 0.65346535 0.62626263 0.68571429
|
|
0.67924528 0.67924528 0.6407767 0.62 ]
|
|
|
|
mean value: 0.6522859554797765
|
|
|
|
key: test_precision
|
|
value: [1. 0. 0.5 1. 0.66666667 0.
|
|
0.5 0.75 0. 1. ]
|
|
|
|
mean value: 0.5416666666666666
|
|
|
|
key: train_precision
|
|
value: [0.96774194 0.9047619 1. 1. 1. 0.97297297
|
|
0.94736842 0.94736842 0.97058824 1. ]
|
|
|
|
mean value: 0.9710801890618129
|
|
|
|
key: test_recall
|
|
value: [0.28571429 0. 0.125 0.125 0.5 0.
|
|
0.25 0.375 0. 0.28571429]
|
|
|
|
mean value: 0.19464285714285715
|
|
|
|
key: train_recall
|
|
value: [0.43478261 0.55072464 0.48529412 0.48529412 0.45588235 0.52941176
|
|
0.52941176 0.52941176 0.47826087 0.44927536]
|
|
|
|
mean value: 0.49277493606138106
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.5 0.54464286 0.5625 0.71428571 0.5
|
|
0.58928571 0.66964286 0.5 0.64285714]
|
|
|
|
mean value: 0.5866071428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.71541502 0.76745718 0.74264706 0.74264706 0.72794118 0.76273738
|
|
0.76076887 0.76076887 0.73716193 0.72463768]
|
|
|
|
mean value: 0.7442182233759956
|
|
|
|
key: test_jcc
|
|
value: [0.28571429 0. 0.11111111 0.125 0.4 0.
|
|
0.2 0.33333333 0. 0.28571429]
|
|
|
|
mean value: 0.1740873015873016
|
|
|
|
key: train_jcc
|
|
value: [0.42857143 0.52054795 0.48529412 0.48529412 0.45588235 0.52173913
|
|
0.51428571 0.51428571 0.47142857 0.44927536]
|
|
|
|
mean value: 0.4846604454765825
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.45310259 1.60784292 1.29070163 1.57894993 1.62366939 1.49906635
|
|
1.52056623 1.68461752 1.8801136 1.26551223]
|
|
|
|
mean value: 1.5404142379760741
|
|
|
|
key: score_time
|
|
value: [0.0127666 0.01392817 0.01249576 0.01368737 0.0171051 0.01363277
|
|
0.01388717 0.02197933 0.01406717 0.01296186]
|
|
|
|
mean value: 0.014651131629943848
|
|
|
|
key: test_mcc
|
|
value: [0.68887476 0.49365725 0.36493797 0.75134288 0.46291005 0.16205093
|
|
0.67857143 0.6172134 0.61237244 0.61237244]
|
|
|
|
mean value: 0.5444303548709625
|
|
|
|
key: train_mcc
|
|
value: [0.98155468 0.99086739 0.98135711 0.98135711 0.97192696 0.97192696
|
|
0.98135711 0.9906716 0.98157024 0.9722504 ]
|
|
|
|
mean value: 0.9804839554759929
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 0.86111111 0.80555556 0.91666667 0.80555556 0.77777778
|
|
0.88888889 0.86111111 0.88571429 0.88571429]
|
|
|
|
mean value: 0.8576984126984127
|
|
|
|
key: train_accuracy
|
|
value: [0.99378882 0.99689441 0.99378882 0.99378882 0.99068323 0.99068323
|
|
0.99378882 0.99689441 0.99380805 0.99071207]
|
|
|
|
mean value: 0.993483068284522
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.44444444 0.46153846 0.76923077 0.58823529 0.2
|
|
0.75 0.70588235 0.6 0.66666667]
|
|
|
|
mean value: 0.5935997988939166
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 0.99280576 0.98529412 0.98529412 0.97777778 0.97777778
|
|
0.98529412 0.99259259 0.98550725 0.97810219]
|
|
|
|
mean value: 0.9845952939019653
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.6 1. 0.55555556 0.5
|
|
0.75 0.66666667 1. 0.8 ]
|
|
|
|
mean value: 0.7538888888888888
|
|
|
|
key: train_precision
|
|
value: [0.98550725 0.98571429 0.98529412 0.98529412 0.98507463 0.98507463
|
|
0.98529412 1. 0.98550725 0.98529412]
|
|
|
|
mean value: 0.9868054502787488
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.28571429 0.375 0.625 0.625 0.125
|
|
0.75 0.75 0.42857143 0.57142857]
|
|
|
|
mean value: 0.5392857142857143
|
|
|
|
key: train_recall
|
|
value: [0.98550725 1. 0.98529412 0.98529412 0.97058824 0.97058824
|
|
0.98529412 0.98529412 0.98550725 0.97101449]
|
|
|
|
mean value: 0.9824381926683717
|
|
|
|
key: test_roc_auc
|
|
value: [0.87684729 0.64285714 0.65178571 0.8125 0.74107143 0.54464286
|
|
0.83928571 0.82142857 0.71428571 0.76785714]
|
|
|
|
mean value: 0.741256157635468
|
|
|
|
key: train_roc_auc
|
|
value: [0.99077734 0.99802372 0.99067855 0.99067855 0.98332561 0.98332561
|
|
0.99067855 0.99264706 0.99078512 0.98353874]
|
|
|
|
mean value: 0.9894458866612843
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.28571429 0.3 0.625 0.41666667 0.11111111
|
|
0.6 0.54545455 0.42857143 0.5 ]
|
|
|
|
mean value: 0.44125180375180373
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 0.98571429 0.97101449 0.97101449 0.95652174 0.95652174
|
|
0.97101449 0.98529412 0.97142857 0.95714286]
|
|
|
|
mean value: 0.9697095359883083
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02383637 0.02124 0.01789117 0.01587868 0.0169847 0.01740932
|
|
0.01692271 0.01825738 0.01615906 0.01933932]
|
|
|
|
mean value: 0.018391871452331544
|
|
|
|
key: score_time
|
|
value: [0.01232696 0.010185 0.00895429 0.00890899 0.00881505 0.00891113
|
|
0.00893497 0.00903082 0.00903559 0.00908685]
|
|
|
|
mean value: 0.009418964385986328
|
|
|
|
key: test_mcc
|
|
value: [0.8174367 0.75032247 1. 0.91914503 0.53300179 0.66143783
|
|
0.51785714 0.86189161 0.49391458 0.81649658]
|
|
|
|
mean value: 0.737150373784473
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94444444 0.91666667 1. 0.97222222 0.77777778 0.88888889
|
|
0.83333333 0.94444444 0.85714286 0.94285714]
|
|
|
|
mean value: 0.9077777777777778
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 1. 0.93333333 0.63636364 0.66666667
|
|
0.625 0.88888889 0.54545455 0.83333333]
|
|
|
|
mean value: 0.7762373737373737
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 1. 0.5 1. 0.625 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8425
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.875 0.875 0.5
|
|
0.625 1. 0.42857143 0.71428571]
|
|
|
|
mean value: 0.7589285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.89408867 1. 0.9375 0.8125 0.75
|
|
0.75892857 0.96428571 0.69642857 0.85714286]
|
|
|
|
mean value: 0.8528017241379311
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 1. 0.875 0.46666667 0.5
|
|
0.45454545 0.8 0.375 0.71428571]
|
|
|
|
mean value: 0.6566450216450217
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10711122 0.10529757 0.10800672 0.10813236 0.10637426 0.10711551
|
|
0.106498 0.10879707 0.10712099 0.10767961]
|
|
|
|
mean value: 0.10721333026885986
|
|
|
|
key: score_time
|
|
value: [0.01776719 0.01798534 0.01802421 0.01788712 0.01799774 0.01796985
|
|
0.01869559 0.0179739 0.01853013 0.01776457]
|
|
|
|
mean value: 0.018059563636779786
|
|
|
|
key: test_mcc
|
|
value: [0.71962292 0.1872493 0.65737574 0.66143783 0.56354451 0.16205093
|
|
0.58149992 0.58149992 0.34299717 0.71842121]
|
|
|
|
mean value: 0.5175699436874827
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.80555556 0.88888889 0.88888889 0.83333333 0.77777778
|
|
0.86111111 0.86111111 0.82857143 0.91428571]
|
|
|
|
mean value: 0.8576190476190476
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.22222222 0.71428571 0.66666667 0.66666667 0.2
|
|
0.66666667 0.66666667 0.25 0.72727273]
|
|
|
|
mean value: 0.5507720057720058
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.5 0.83333333 1. 0.6 0.5
|
|
0.71428571 0.71428571 1. 1. ]
|
|
|
|
mean value: 0.7861904761904762
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.57142857 0.14285714 0.625 0.5 0.75 0.125
|
|
0.625 0.625 0.14285714 0.57142857]
|
|
|
|
mean value: 0.46785714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.55418719 0.79464286 0.75 0.80357143 0.54464286
|
|
0.77678571 0.77678571 0.57142857 0.78571429]
|
|
|
|
mean value: 0.7143472906403942
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.125 0.55555556 0.5 0.5 0.11111111
|
|
0.5 0.5 0.14285714 0.57142857]
|
|
|
|
mean value: 0.40773809523809523
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.68
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00979185 0.01068974 0.01069331 0.00985026 0.0098598 0.00974846
|
|
0.01084542 0.00980043 0.01022649 0.01085448]
|
|
|
|
mean value: 0.010236024856567383
|
|
|
|
key: score_time
|
|
value: [0.00915313 0.00982594 0.00975847 0.00891304 0.00955153 0.00960112
|
|
0.0089066 0.00889993 0.00894642 0.00965571]
|
|
|
|
mean value: 0.009321188926696778
|
|
|
|
key: test_mcc
|
|
value: [ 0.43895468 -0.03138824 0.19642857 0.2438548 0.29366622 0.75134288
|
|
0.26519742 0.07503225 0.49391458 0.15161961]
|
|
|
|
mean value: 0.28786227687707744
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72222222 0.69444444 0.72222222 0.75 0.69444444 0.91666667
|
|
0.72222222 0.69444444 0.85714286 0.74285714]
|
|
|
|
mean value: 0.7516666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.54545455 0.15384615 0.375 0.4 0.47619048 0.76923077
|
|
0.44444444 0.26666667 0.54545455 0.30769231]
|
|
|
|
mean value: 0.4283979908979909
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.4 0.16666667 0.375 0.42857143 0.38461538 1.
|
|
0.4 0.28571429 0.75 0.33333333]
|
|
|
|
mean value: 0.4523901098901099
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.14285714 0.375 0.375 0.625 0.625
|
|
0.5 0.25 0.42857143 0.28571429]
|
|
|
|
mean value: 0.4464285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.77339901 0.48522167 0.59821429 0.61607143 0.66964286 0.8125
|
|
0.64285714 0.53571429 0.69642857 0.57142857]
|
|
|
|
mean value: 0.6401477832512316
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.375 0.08333333 0.23076923 0.25 0.3125 0.625
|
|
0.28571429 0.15384615 0.375 0.18181818]
|
|
|
|
mean value: 0.28729811854811854
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.50880575 1.47889471 1.49357581 1.51851034 1.48310709 1.48366189
|
|
1.48690367 1.52164292 1.48334312 1.50404835]
|
|
|
|
mean value: 1.4962493658065796
|
|
|
|
key: score_time
|
|
value: [0.09659505 0.09937906 0.0998745 0.09763646 0.09497857 0.09289384
|
|
0.09455442 0.09473515 0.09559774 0.09992361]
|
|
|
|
mean value: 0.09661684036254883
|
|
|
|
key: test_mcc
|
|
value: [1. 0.49365725 0.83666003 0.66143783 0.67857143 0.75134288
|
|
0.77151675 0.77151675 0.61237244 0.90971765]
|
|
|
|
mean value: 0.7486793002774902
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.86111111 0.94444444 0.88888889 0.88888889 0.91666667
|
|
0.91666667 0.91666667 0.88571429 0.97142857]
|
|
|
|
mean value: 0.919047619047619
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.44444444 0.85714286 0.66666667 0.75 0.76923077
|
|
0.82352941 0.82352941 0.6 0.92307692]
|
|
|
|
mean value: 0.7657620484091072
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 0.75 1.
|
|
0.77777778 0.77777778 1. 1. ]
|
|
|
|
mean value: 0.9305555555555556
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.28571429 0.75 0.5 0.75 0.625
|
|
0.875 0.875 0.42857143 0.85714286]
|
|
|
|
mean value: 0.6946428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.64285714 0.875 0.75 0.83928571 0.8125
|
|
0.90178571 0.90178571 0.71428571 0.92857143]
|
|
|
|
mean value: 0.8366071428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.28571429 0.75 0.5 0.6 0.625
|
|
0.7 0.7 0.42857143 0.85714286]
|
|
|
|
mean value: 0.6446428571428571
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.83
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.79229784 0.95148969 0.91167283 0.94181609 0.88025999 0.92065978
|
|
0.91286707 0.9315846 0.88310695 0.94630051]
|
|
|
|
mean value: 1.0072055339813233
|
|
|
|
key: score_time
|
|
value: [0.20599699 0.3146832 0.24151015 0.27133441 0.13036633 0.22741055
|
|
0.23978066 0.25486302 0.21415901 0.17490315]
|
|
|
|
mean value: 0.2275007486343384
|
|
|
|
key: test_mcc
|
|
value: [0.71962292 0.34404556 0.66143783 0.66143783 0.47809144 0.45374261
|
|
0.75032247 0.47809144 0.49236596 0.61237244]
|
|
|
|
mean value: 0.565153049910657
|
|
|
|
key: train_mcc
|
|
value: [0.88709235 0.92532149 0.9244842 0.93401658 0.9244842 0.9244842
|
|
0.90518666 0.90534273 0.92534731 0.90647794]
|
|
|
|
mean value: 0.9162237656285032
|
|
|
|
key: test_accuracy
|
|
value: [0.91666667 0.83333333 0.88888889 0.88888889 0.83333333 0.83333333
|
|
0.91666667 0.83333333 0.85714286 0.88571429]
|
|
|
|
mean value: 0.8687301587301587
|
|
|
|
key: train_accuracy
|
|
value: [0.96273292 0.97515528 0.97515528 0.97826087 0.97515528 0.97515528
|
|
0.9689441 0.9689441 0.9752322 0.96904025]
|
|
|
|
mean value: 0.9723775551410495
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.25 0.66666667 0.66666667 0.57142857 0.4
|
|
0.8 0.57142857 0.44444444 0.6 ]
|
|
|
|
mean value: 0.5697907647907648
|
|
|
|
key: train_fscore
|
|
value: [0.90769231 0.93939394 0.93846154 0.94656489 0.93846154 0.93846154
|
|
0.92307692 0.921875 0.94029851 0.92307692]
|
|
|
|
mean value: 0.9317363101583578
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 0.66666667 1.
|
|
0.85714286 0.66666667 1. 1. ]
|
|
|
|
mean value: 0.919047619047619
|
|
|
|
key: train_precision
|
|
value: [0.96721311 0.98412698 0.98387097 0.98412698 0.98387097 0.98387097
|
|
0.96774194 0.98333333 0.96923077 0.98360656]
|
|
|
|
mean value: 0.9790992581658896
|
|
|
|
key: test_recall
|
|
value: [0.57142857 0.14285714 0.5 0.5 0.5 0.25
|
|
0.75 0.5 0.28571429 0.42857143]
|
|
|
|
mean value: 0.44285714285714284
|
|
|
|
key: train_recall
|
|
value: [0.85507246 0.89855072 0.89705882 0.91176471 0.89705882 0.89705882
|
|
0.88235294 0.86764706 0.91304348 0.86956522]
|
|
|
|
mean value: 0.8889173060528559
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.57142857 0.75 0.75 0.71428571 0.625
|
|
0.85714286 0.71428571 0.64285714 0.71428571]
|
|
|
|
mean value: 0.7125
|
|
|
|
key: train_roc_auc
|
|
value: [0.92358366 0.94729908 0.94656091 0.95391385 0.94656091 0.94656091
|
|
0.93723946 0.93185503 0.95258473 0.9328141 ]
|
|
|
|
mean value: 0.941897263713926
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.14285714 0.5 0.5 0.4 0.25
|
|
0.66666667 0.4 0.28571429 0.42857143]
|
|
|
|
mean value: 0.4145238095238095
|
|
|
|
key: train_jcc
|
|
value: [0.83098592 0.88571429 0.88405797 0.89855072 0.88405797 0.88405797
|
|
0.85714286 0.85507246 0.88732394 0.85714286]
|
|
|
|
mean value: 0.8724106960604205
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02311754 0.00943446 0.00950122 0.00938201 0.00950313 0.00959158
|
|
0.00952291 0.00950456 0.00949812 0.00953507]
|
|
|
|
mean value: 0.010859060287475585
|
|
|
|
key: score_time
|
|
value: [0.01207566 0.0087018 0.00878215 0.00869799 0.00875616 0.00878167
|
|
0.0086906 0.00866127 0.00872397 0.00867581]
|
|
|
|
mean value: 0.00905470848083496
|
key: test_mcc
value: [ 0.75032247 0.2085873 0.16205093 0.47809144 0.58149992 -0.18898224
 0.35714286 0.0805823 0.49391458 0.10206207]

mean value: 0.30252716321584283

key: train_mcc
value: [0.39769343 0.45897008 0.43461577 0.4144431 0.49728141 0.50633817
 0.48412839 0.39761525 0.41442016 0.42938548]

mean value: 0.4434891238000965

key: test_accuracy
value: [0.91666667 0.77777778 0.77777778 0.83333333 0.86111111 0.66666667
 0.77777778 0.75 0.85714286 0.77142857]

mean value: 0.798968253968254

key: train_accuracy
value: [0.81677019 0.83850932 0.82608696 0.82608696 0.84782609 0.85093168
 0.84161491 0.81987578 0.82352941 0.82352941]

mean value: 0.8314760686883449

key: test_fscore
value: [0.8 0.33333333 0.2 0.57142857 0.66666667 0.
 0.5 0.18181818 0.54545455 0.2 ]

mean value: 0.3998701298701299

key: train_fscore
value: [0.4957265 0.52727273 0.53333333 0.5 0.57391304 0.57894737
 0.57142857 0.49122807 0.50434783 0.52892562]

mean value: 0.5305123055757547

key: test_precision
value: [0.75 0.4 0.5 0.66666667 0.71428571 0.
 0.5 0.33333333 0.75 0.33333333]

mean value: 0.49476190476190474

key: train_precision
value: [0.60416667 0.70731707 0.61538462 0.63636364 0.70212766 0.7173913
 0.66666667 0.60869565 0.63043478 0.61538462]

mean value: 0.6503932672341834

key: test_recall
value: [0.85714286 0.28571429 0.125 0.5 0.625 0.
 0.5 0.125 0.42857143 0.14285714]

mean value: 0.35892857142857143

key: train_recall
value: [0.42028986 0.42028986 0.47058824 0.41176471 0.48529412 0.48529412
 0.5 0.41176471 0.42028986 0.46376812]

mean value: 0.44893435635123613

key: test_roc_auc
value: [0.89408867 0.591133 0.54464286 0.71428571 0.77678571 0.42857143
 0.67857143 0.52678571 0.69642857 0.53571429]

mean value: 0.6387007389162562

key: train_roc_auc
value: [0.67259552 0.68642951 0.69592404 0.67438629 0.715088 0.71705651
 0.71653543 0.67044928 0.67668036 0.69251398]

mean value: 0.6917658928125731

key: test_jcc
value: [0.66666667 0.2 0.11111111 0.4 0.5 0.
 0.33333333 0.1 0.375 0.11111111]

mean value: 0.2797222222222222

key: train_jcc
value: [0.32954545 0.35802469 0.36363636 0.33333333 0.40243902 0.40740741
 0.4 0.3255814 0.3372093 0.35955056]

mean value: 0.3616727534142999

MCC on Blind test: 0.38

Accuracy on Blind test: 0.81

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.09641838 0.05527234 0.0620811 0.07439089 0.05781317 0.06530595
 0.06510544 0.08454537 0.20302868 0.05387855]

mean value: 0.08178398609161378

key: score_time
value: [0.0110662 0.01033831 0.01064897 0.01097441 0.01094913 0.01161528
 0.01144314 0.01093102 0.01129246 0.01139951]

mean value: 0.011065840721130371

key: test_mcc
value: [1. 0.91914503 1. 0.91914503 0.80582296 0.91914503
 0.9258201 0.86189161 0.72019314 0.90971765]

mean value: 0.8980880554918083

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 0.97222222 1. 0.97222222 0.91666667 0.97222222
 0.97222222 0.94444444 0.91428571 0.97142857]

mean value: 0.9635714285714285

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 0.93333333 1. 0.93333333 0.84210526 0.93333333
 0.94117647 0.88888889 0.76923077 0.92307692]

mean value: 0.9164478314942711

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.875 1. 1. 0.72727273 1.
 0.88888889 0.8 0.83333333 1. ]

mean value: 0.912449494949495

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 0.875 1. 0.875
 1. 1. 0.71428571 0.85714286]

mean value: 0.9321428571428572

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 0.98275862 1. 0.9375 0.94642857 0.9375
 0.98214286 0.96428571 0.83928571 0.92857143]

mean value: 0.9518472906403941

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 0.875 1. 0.875 0.72727273 0.875
 0.88888889 0.8 0.625 0.85714286]

mean value: 0.8523304473304474

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.91

Accuracy on Blind test: 0.97

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.04570127 0.09128404 0.06899452 0.07317495 0.06892586 0.06894684
 0.07744288 0.03692055 0.05794311 0.05879235]

mean value: 0.06481263637542725

key: score_time
value: [0.01260543 0.01506281 0.02454662 0.02456522 0.02190804 0.01689649
 0.01267576 0.01267529 0.01691985 0.01226139]

mean value: 0.017011690139770507

key: test_mcc
value: [0.72192954 0.6453202 0.55814043 0.5157267 0.66077483 0.67857143
 0.58149992 0.6172134 0.81649658 0.72019314]

mean value: 0.6515866163842986

key: train_mcc
value: [0.9358192 0.9358192 0.93513953 0.93513953 0.93513953 0.93513953
 0.93513953 0.95318232 0.93587381 0.92628095]

mean value: 0.9362673146575943

key: test_accuracy
value: [0.91666667 0.88888889 0.86111111 0.80555556 0.86111111 0.88888889
 0.86111111 0.86111111 0.94285714 0.91428571]

mean value: 0.8801587301587301

key: train_accuracy
value: [0.97826087 0.97826087 0.97826087 0.97826087 0.97826087 0.97826087
 0.97826087 0.98447205 0.97832817 0.9752322 ]

mean value: 0.9785858508162991

key: test_fscore
value: [0.76923077 0.71428571 0.61538462 0.63157895 0.73684211 0.75
 0.66666667 0.70588235 0.83333333 0.76923077]

mean value: 0.7192435273704624

key: train_fscore
value: [0.94964029 0.94964029 0.94890511 0.94890511 0.94890511 0.94890511
 0.94890511 0.96296296 0.94964029 0.94202899]

mean value: 0.9498438359224818

key: test_precision
value: [0.83333333 0.71428571 0.8 0.54545455 0.63636364 0.75
 0.71428571 0.66666667 1. 0.83333333]

mean value: 0.7493722943722944

key: train_precision
value: [0.94285714 0.94285714 0.94202899 0.94202899 0.94202899 0.94202899
 0.94202899 0.97014925 0.94285714 0.94202899]

mean value: 0.945089459534625

key: test_recall
value: [0.71428571 0.71428571 0.5 0.75 0.875 0.75
 0.625 0.75 0.71428571 0.71428571]

mean value: 0.7107142857142857

key: train_recall
value: [0.95652174 0.95652174 0.95588235 0.95588235 0.95588235 0.95588235
 0.95588235 0.95588235 0.95652174 0.94202899]

mean value: 0.954688832054561

key: test_roc_auc
value: [0.83990148 0.8226601 0.73214286 0.78571429 0.86607143 0.83928571
 0.77678571 0.82142857 0.85714286 0.83928571]

mean value: 0.8180418719211823

key: train_roc_auc
value: [0.97035573 0.97035573 0.97006716 0.97006716 0.97006716 0.97006716
 0.97006716 0.97400417 0.97038685 0.96314048]

mean value: 0.9698578765482728

key: test_jcc
value: [0.625 0.55555556 0.44444444 0.46153846 0.58333333 0.6
 0.5 0.54545455 0.71428571 0.625 ]

mean value: 0.5654612054612055

key: train_jcc
value: [0.90410959 0.90410959 0.90277778 0.90277778 0.90277778 0.90277778
 0.90277778 0.92857143 0.90410959 0.89041096]

mean value: 0.9045200043487714

MCC on Blind test: 0.68

Accuracy on Blind test: 0.89

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01387548 0.01329589 0.00991011 0.01070857 0.00993299 0.00958133
 0.01013231 0.00957942 0.00979066 0.00977588]

mean value: 0.01065826416015625

key: score_time
value: [0.01285505 0.01248264 0.00921583 0.00976205 0.0090754 0.00870442
 0.00901771 0.0088408 0.00888538 0.00878716]

mean value: 0.00976264476776123

key: test_mcc
value: [0.49629167 0.2085873 0.66143783 0.55814043 0.35714286 0.44883281
 0.46291005 0.17173552 0.61237244 0.64285714]

mean value: 0.4620308033163919

key: train_mcc
value: [0.53438367 0.60265353 0.54665085 0.50633817 0.57652074 0.53118814
 0.55255244 0.57652074 0.53762725 0.53762725]

mean value: 0.5502062784000151

key: test_accuracy
value: [0.86111111 0.77777778 0.88888889 0.86111111 0.77777778 0.83333333
 0.80555556 0.75 0.88571429 0.88571429]

mean value: 0.8326984126984127

key: train_accuracy
value: [0.85714286 0.8757764 0.86024845 0.85093168 0.86956522 0.85714286
 0.86335404 0.86956522 0.85758514 0.85758514]

mean value: 0.8618896986712306

key: test_fscore
value: [0.54545455 0.33333333 0.66666667 0.61538462 0.5 0.5
 0.58823529 0.30769231 0.6 0.71428571]

mean value: 0.537105247693483

key: train_fscore
value: [0.60344828 0.66666667 0.62184874 0.57894737 0.6440678 0.60344828
 0.62068966 0.6440678 0.61016949 0.61016949]

mean value: 0.6203523557751256

key: test_precision
value: [0.75 0.4 1. 0.8 0.5 0.75
 0.55555556 0.4 1. 0.71428571]

mean value: 0.686984126984127

key: train_precision
value: [0.74468085 0.78431373 0.7254902 0.7173913 0.76 0.72916667
 0.75 0.76 0.73469388 0.73469388]

mean value: 0.7440430498748991

key: test_recall
value: [0.42857143 0.28571429 0.5 0.5 0.5 0.375
 0.625 0.25 0.42857143 0.71428571]

mean value: 0.4607142857142857

key: train_recall
value: [0.50724638 0.57971014 0.54411765 0.48529412 0.55882353 0.51470588
 0.52941176 0.55882353 0.52173913 0.52173913]

mean value: 0.5321611253196931

key: test_roc_auc
value: [0.69704433 0.591133 0.75 0.73214286 0.67857143 0.66964286
 0.74107143 0.57142857 0.71428571 0.82142857]

mean value: 0.6966748768472907

key: train_roc_auc
value: [0.72990777 0.76811594 0.74449977 0.71705651 0.75578972 0.73176239
 0.74108384 0.75578972 0.73527901 0.73527901]

mean value: 0.7414563679569117

key: test_jcc
value: [0.375 0.2 0.5 0.44444444 0.33333333 0.33333333
 0.41666667 0.18181818 0.42857143 0.55555556]

mean value: 0.37687229437229436

key: train_jcc
value: [0.43209877 0.5 0.45121951 0.40740741 0.475 0.43209877
 0.45 0.475 0.43902439 0.43902439]

mean value: 0.4500873230954532

MCC on Blind test: 0.56

Accuracy on Blind test: 0.87

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01311135 0.01836658 0.01801801 0.02214932 0.0201776 0.02150178
 0.03540277 0.02377319 0.01945472 0.02329373]

mean value: 0.021524906158447266

key: score_time
value: [0.00971651 0.01126003 0.01196456 0.01231813 0.01482916 0.0148766
 0.01425123 0.01228762 0.01243329 0.01276159]

mean value: 0.012669873237609864

key: test_mcc
value: [0.85096294 0.72192954 0.44883281 0.91914503 0.41267736 0.31622777
 0.37067856 0.6172134 0.61237244 0.72019314]

mean value: 0.5990232995602679

key: train_mcc
value: [0.84986344 0.90206627 0.77744561 0.94588078 0.86625969 0.8365424
 0.93513953 0.94378174 0.81007791 0.92534731]

mean value: 0.8792404687832709

key: test_accuracy
value: [0.94444444 0.91666667 0.83333333 0.97222222 0.80555556 0.80555556
 0.75 0.86111111 0.88571429 0.91428571]

mean value: 0.8688888888888889

key: train_accuracy
value: [0.95031056 0.96583851 0.92857143 0.98136646 0.95652174 0.94720497
 0.97826087 0.98136646 0.9380805 0.9752322 ]

mean value: 0.9602753687287272

key: test_fscore
value: [0.875 0.76923077 0.5 0.93333333 0.53333333 0.22222222
 0.52631579 0.70588235 0.6 0.76923077]

mean value: 0.6434548569765288

key: train_fscore
value: [0.88059701 0.92307692 0.8 0.95714286 0.890625 0.864
 0.94890511 0.95384615 0.83333333 0.94029851]

mean value: 0.8991824899276378

key: test_precision
value: [0.77777778 0.83333333 0.75 1. 0.57142857 1.
 0.45454545 0.66666667 1. 0.83333333]

mean value: 0.7887085137085137

key: train_precision
value: [0.90769231 0.89189189 0.9787234 0.93055556 0.95 0.94736842
 0.94202899 1. 0.98039216 0.96923077]

mean value: 0.9497883492048467

key: test_recall
value: [1. 0.71428571 0.375 0.875 0.5 0.125
 0.625 0.75 0.42857143 0.71428571]

mean value: 0.6107142857142858

key: train_recall
value: [0.85507246 0.95652174 0.67647059 0.98529412 0.83823529 0.79411765
 0.95588235 0.91176471 0.72463768 0.91304348]

mean value: 0.8611040068201193

key: test_roc_auc
value: [0.96551724 0.83990148 0.66964286 0.9375 0.69642857 0.5625
 0.70535714 0.82142857 0.71428571 0.83928571]

mean value: 0.7751847290640395

key: train_roc_auc
value: [0.91567852 0.96245059 0.83626679 0.98280454 0.91321214 0.89115331
 0.97006716 0.95588235 0.86035034 0.95258473]

mean value: 0.9240450475107724

key: test_jcc
value: [0.77777778 0.625 0.33333333 0.875 0.36363636 0.125
 0.35714286 0.54545455 0.42857143 0.625 ]

mean value: 0.5055916305916306

key: train_jcc
value: [0.78666667 0.85714286 0.66666667 0.91780822 0.8028169 0.76056338
 0.90277778 0.91176471 0.71428571 0.88732394]

mean value: 0.820781683295223

MCC on Blind test: 0.72

Accuracy on Blind test: 0.91

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01766181 0.01630974 0.0205574 0.01550651 0.01691341 0.01607966
 0.01694655 0.01806903 0.01713586 0.01994872]

mean value: 0.017512869834899903

key: score_time
value: [0.0132618 0.01240635 0.01228547 0.0124495 0.01213622 0.01185417
 0.01222014 0.01217818 0.01196337 0.01226044]

mean value: 0.01230156421661377

key: test_mcc
value: [0.58131836 0.34404556 0.67005939 0.75134288 0.51785714 0.5976143
 0.50560765 0.51785714 0.35478744 0.61237244]

mean value: 0.5452862312081964

key: train_mcc
value: [0.77957604 0.49277338 0.66452587 0.84671817 0.85667348 0.43962631
 0.72241165 0.87641313 0.69158946 0.89676152]

mean value: 0.7267069018892041

key: test_accuracy
value: [0.77777778 0.83333333 0.83333333 0.91666667 0.83333333 0.77777778
 0.69444444 0.83333333 0.71428571 0.88571429]

mean value: 0.81

key: train_accuracy
value: [0.91614907 0.84782609 0.83850932 0.95031056 0.95341615 0.63043478
 0.8757764 0.95962733 0.85139319 0.96594427]

mean value: 0.8789387150741304

key: test_fscore
value: [0.63636364 0.25 0.72727273 0.76923077 0.625 0.66666667
 0.59259259 0.625 0.5 0.66666667]

mean value: 0.6058793058793058

key: train_fscore
value: [0.82580645 0.4494382 0.72043011 0.875 0.88372093 0.53333333
 0.77011494 0.896 0.74193548 0.91603053]

mean value: 0.7611809985703716

key: test_precision
value: [0.46666667 1. 0.57142857 1. 0.625 0.5
 0.42105263 0.625 0.38461538 0.8 ]

mean value: 0.6393763254289571

key: train_precision
value: [0.74418605 1. 0.56779661 0.93333333 0.93442623 0.36363636
 0.63207547 0.98245614 0.58974359 0.96774194]

mean value: 0.7715395720435464

key: test_recall
value: [1. 0.14285714 1. 0.625 0.625 1.
 1. 0.625 0.71428571 0.57142857]

mean value: 0.7303571428571428

key: train_recall
value: [0.92753623 0.28985507 0.98529412 0.82352941 0.83823529 1.
 0.98529412 0.82352941 1. 0.86956522]

mean value: 0.8542838874680307

key: test_roc_auc
value: [0.86206897 0.57142857 0.89285714 0.8125 0.75892857 0.85714286
 0.80357143 0.75892857 0.71428571 0.76785714]

mean value: 0.7799568965517242

key: train_roc_auc
value: [0.92028986 0.64492754 0.89225336 0.90389069 0.91124363 0.76574803
 0.91587541 0.9097962 0.90551181 0.9308456 ]

mean value: 0.8700382121352478

key: test_jcc
value: [0.46666667 0.14285714 0.57142857 0.625 0.45454545 0.5
 0.42105263 0.45454545 0.33333333 0.5 ]

mean value: 0.4469429254955571

key: train_jcc
value: [0.7032967 0.28985507 0.56302521 0.77777778 0.79166667 0.36363636
 0.62616822 0.8115942 0.58974359 0.84507042]

mean value: 0.6361834233401731

MCC on Blind test: 0.53

Accuracy on Blind test: 0.79

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
|
|
value: [0.15819955 0.14830995 0.14526677 0.13937092 0.14099097 0.13902879
|
|
0.1481483 0.1477356 0.14789319 0.14681125]
|
|
|
|
mean value: 0.14617552757263183
|
|
|
|
key: score_time
|
|
value: [0.01697183 0.01665115 0.01544809 0.01664472 0.01590967 0.0155158
|
|
0.01666117 0.0167737 0.01641393 0.01669216]
|
|
|
|
mean value: 0.0163682222366333
|
|
|
|
key: test_mcc
|
|
value: [1. 0.91914503 0.91914503 0.91914503 0.80582296 0.75134288
|
|
0.9258201 0.86189161 0.49391458 0.81649658]
|
|
|
|
mean value: 0.8412723806521708
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97222222 0.97222222 0.97222222 0.91666667 0.91666667
|
|
0.97222222 0.94444444 0.85714286 0.94285714]
|
|
|
|
mean value: 0.9466666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93333333 0.93333333 0.93333333 0.84210526 0.76923077
|
|
0.94117647 0.88888889 0.54545455 0.83333333]
|
|
|
|
mean value: 0.8620189270653666
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 1. 1. 0.72727273 1.
|
|
0.88888889 0.8 0.75 1. ]
|
|
|
|
mean value: 0.9041161616161616
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.875 0.875 1. 0.625
|
|
1. 1. 0.42857143 0.71428571]
|
|
|
|
mean value: 0.8517857142857143
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98275862 0.9375 0.9375 0.94642857 0.8125
|
|
0.98214286 0.96428571 0.69642857 0.85714286]
|
|
|
|
mean value: 0.9116687192118227
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.875 0.875 0.875 0.72727273 0.625
|
|
0.88888889 0.8 0.375 0.71428571]
|
|
|
|
mean value: 0.775544733044733
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.98
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
 _warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05724621 0.0500145 0.0438199 0.07104373 0.0596118 0.06134391
|
|
0.06512403 0.070961 0.0410428 0.05038881]
|
|
|
|
mean value: 0.05705966949462891
|
|
|
|
key: score_time
|
|
value: [0.02019477 0.02353168 0.0291779 0.04015303 0.0278511 0.03402472
|
|
0.0242095 0.04008889 0.02148032 0.02989411]
|
|
|
|
mean value: 0.02906060218811035
|
|
|
|
key: test_mcc
|
|
value: [1. 0.91914503 1. 0.91914503 0.80582296 0.83666003
|
|
0.77151675 0.86189161 0.49391458 0.81649658]
|
|
|
|
mean value: 0.8424592569278723
|
|
|
|
key: train_mcc
|
|
value: [0.98155468 0.98155468 0.97192696 0.99077106 0.99077106 0.9906716
|
|
0.97196923 0.97192696 0.96368577 1. ]
|
|
|
|
mean value: 0.981483200152367
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.97222222 1. 0.97222222 0.91666667 0.94444444
|
|
0.91666667 0.94444444 0.85714286 0.94285714]
|
|
|
|
mean value: 0.9466666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.99378882 0.99378882 0.99068323 0.99689441 0.99689441 0.99689441
|
|
0.99068323 0.99068323 0.9876161 1. ]
|
|
|
|
mean value: 0.9937926658077418
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93333333 1. 0.93333333 0.84210526 0.85714286
|
|
0.82352941 0.88888889 0.54545455 0.83333333]
|
|
|
|
mean value: 0.8657120966408892
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 0.98550725 0.97777778 0.99270073 0.99270073 0.99259259
|
|
0.97744361 0.97777778 0.97142857 1. ]
|
|
|
|
mean value: 0.9853436281206914
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 1. 1. 0.72727273 1.
|
|
0.77777778 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8930050505050505
|
|
|
|
key: train_precision
|
|
value: [0.98550725 0.98550725 0.98507463 0.98550725 0.98550725 1.
|
|
1. 0.98507463 0.95774648 1. ]
|
|
|
|
mean value: 0.9869924718111829
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.875 1. 0.75
|
|
0.875 1. 0.42857143 0.71428571]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_recall
|
|
value: [0.98550725 0.98550725 0.97058824 1. 1. 0.98529412
|
|
0.95588235 0.97058824 0.98550725 1. ]
|
|
|
|
mean value: 0.9838874680306906
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98275862 1. 0.9375 0.94642857 0.875
|
|
0.90178571 0.96428571 0.69642857 0.85714286]
|
|
|
|
mean value: 0.9161330049261084
|
|
|
|
key: train_roc_auc
|
|
value: [0.99077734 0.99077734 0.98332561 0.9980315 0.9980315 0.99264706
|
|
0.97794118 0.98332561 0.98684811 1. ]
|
|
|
|
mean value: 0.9901705243424437
|
|
|
|
key: test_jcc
|
|
value: [1. 0.875 1. 0.875 0.72727273 0.75
|
|
0.7 0.8 0.375 0.71428571]
|
|
|
|
mean value: 0.7816558441558441
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 0.97142857 0.95652174 0.98550725 0.98550725 0.98529412
|
|
0.95588235 0.95652174 0.94444444 1. ]
|
|
|
|
mean value: 0.9712536028904315
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08166194 0.14968729 0.11179137 0.07614708 0.08787417 0.1285069
|
|
0.12228823 0.12631869 0.09615564 0.09222722]
|
|
|
|
mean value: 0.10726585388183593
|
|
|
|
key: score_time
|
|
value: [0.02172256 0.02620792 0.02364445 0.01462293 0.022928 0.02531552
|
|
0.02583742 0.02750945 0.02175593 0.02795911]
|
|
|
|
mean value: 0.02375032901763916
|
|
|
|
key: test_mcc
|
|
value: [ 0.1872493 0.1872493 -0.09035079 0. -0.12964074 0.
|
|
0.2362278 0.45374261 -0.08574929 0. ]
|
|
|
|
mean value: 0.07587281768515772
|
|
|
|
key: train_mcc
|
|
value: [0.91634855 0.93507164 0.93434457 0.90588785 0.89634849 0.91539921
|
|
0.93434457 0.90588785 0.90701894 0.91641052]
|
|
|
|
mean value: 0.9167062185712945
|
|
|
|
key: test_accuracy
|
|
value: [0.80555556 0.80555556 0.75 0.77777778 0.72222222 0.77777778
|
|
0.77777778 0.83333333 0.77142857 0.8 ]
|
|
|
|
mean value: 0.7821428571428571
|
|
|
|
key: train_accuracy
|
|
value: [0.97204969 0.97826087 0.97826087 0.9689441 0.96583851 0.97204969
|
|
0.97826087 0.9689441 0.96904025 0.97213622]
|
|
|
|
mean value: 0.972378516624041
|
|
|
|
key: test_fscore
|
|
value: [0.22222222 0.22222222 0. 0. 0. 0.
|
|
0.33333333 0.4 0. 0. ]
|
|
|
|
mean value: 0.11777777777777779
|
|
|
|
key: train_fscore
|
|
value: [0.93023256 0.94656489 0.94573643 0.92063492 0.912 0.92913386
|
|
0.94573643 0.92063492 0.921875 0.93023256]
|
|
|
|
mean value: 0.9302781569529865
|
|
|
|
key: test_precision
|
|
value: [0.5 0.5 0. 0. 0. 0. 0.5 1. 0. 0. ]
|
|
|
|
mean value: 0.25
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.14285714 0.14285714 0. 0. 0. 0.
|
|
0.25 0.25 0. 0. ]
|
|
|
|
mean value: 0.07857142857142857
|
|
|
|
key: train_recall
|
|
value: [0.86956522 0.89855072 0.89705882 0.85294118 0.83823529 0.86764706
|
|
0.89705882 0.85294118 0.85507246 0.86956522]
|
|
|
|
mean value: 0.8698635976129583
|
|
|
|
key: test_roc_auc
|
|
value: [0.55418719 0.55418719 0.48214286 0.5 0.46428571 0.5
|
|
0.58928571 0.625 0.48214286 0.5 ]
|
|
|
|
mean value: 0.5251231527093596
|
|
|
|
key: train_roc_auc
|
|
value: [0.93478261 0.94927536 0.94852941 0.92647059 0.91911765 0.93382353
|
|
0.94852941 0.92647059 0.92753623 0.93478261]
|
|
|
|
mean value: 0.9349317988064791
|
|
|
|
key: test_jcc
|
|
value: [0.125 0.125 0. 0. 0. 0. 0.2 0.25 0. 0. ]
|
|
|
|
mean value: 0.07
|
|
|
|
key: train_jcc
|
|
value: [0.86956522 0.89855072 0.89705882 0.85294118 0.83823529 0.86764706
|
|
0.89705882 0.85294118 0.85507246 0.86956522]
|
|
|
|
mean value: 0.8698635976129583
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.52069354 0.50328159 0.50455189 0.50025082 0.50435662 0.50333905
|
|
0.49607444 0.50226235 0.50641394 0.49668527]
|
|
|
|
mean value: 0.5037909507751465
|
|
|
|
key: score_time
|
|
value: [0.00986624 0.00982571 0.01003551 0.00934696 0.00967073 0.00945807
|
|
0.0095005 0.00956893 0.01015067 0.01010966]
|
|
|
|
mean value: 0.00975329875946045
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8226601 1. 0.91914503 0.71098137 0.75134288
|
|
0.86189161 0.86189161 0.61237244 1. ]
|
|
|
|
mean value: 0.8540285030671064
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94444444 1. 0.97222222 0.86111111 0.91666667
|
|
0.94444444 0.94444444 0.88571429 1. ]
|
|
|
|
mean value: 0.9469047619047619
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.85714286 1. 0.93333333 0.76190476 0.76923077
|
|
0.88888889 0.88888889 0.66666667 1. ]
|
|
|
|
mean value: 0.8766056166056166
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.85714286 1. 1. 0.61538462 1.
|
|
0.8 0.8 0.8 1. ]
|
|
|
|
mean value: 0.8872527472527473
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.875 1. 0.625
|
|
1. 1. 0.57142857 1. ]
|
|
|
|
mean value: 0.8928571428571428
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.91133005 1. 0.9375 0.91071429 0.8125
|
|
0.96428571 0.96428571 0.76785714 1. ]
|
|
|
|
mean value: 0.9268472906403941
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.75 1. 0.875 0.61538462 0.625
|
|
0.8 0.8 0.5 1. ]
|
|
|
|
mean value: 0.7965384615384615
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.91
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
 warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02580166 0.02822471 0.02490973 0.04556131 0.0245204 0.02402449
|
|
0.02442002 0.02469778 0.02477241 0.02475023]
|
|
|
|
mean value: 0.02716827392578125
|
|
|
|
key: score_time
|
|
value: [0.01345968 0.01275158 0.01257443 0.01375508 0.01480269 0.01374626
|
|
0.01462483 0.01372385 0.01461887 0.0150516 ]
|
|
|
|
mean value: 0.013910889625549316
|
|
|
|
key: test_mcc
|
|
value: [-0.08304548 -0.11915865 -0.16116459 -0.23904572 -0.12964074 -0.3086067
|
|
0.32232919 -0.12964074 -0.15309311 0.15161961]
|
|
|
|
mean value: -0.08494469446571944
|
|
|
|
key: train_mcc
|
|
value: [0.24048671 0.24048671 0.28810855 0.21675985 0.28810855 0.2663143
|
|
0.18742507 0.24272682 0.24058235 0.15144495]
|
|
|
|
mean value: 0.23624438625695043
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.75 0.69444444 0.61111111 0.72222222 0.52777778
|
|
0.80555556 0.72222222 0.71428571 0.74285714]
|
|
|
|
mean value: 0.7068253968253968
|
|
|
|
key: train_accuracy
|
|
value: [0.80124224 0.80124224 0.81055901 0.80124224 0.81055901 0.80745342
|
|
0.79813665 0.80434783 0.80185759 0.79256966]
|
|
|
|
mean value: 0.8029209853277696
|
|
|
|
key: test_fscore
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0.36363636 0. 0. 0.30769231]
|
|
|
|
mean value: 0.06713286713286713
|
|
|
|
key: train_fscore
|
|
value: [0.13513514 0.13513514 0.18666667 0.11111111 0.18666667 0.16216216
|
|
0.08450704 0.1369863 0.13513514 0.05633803]
|
|
|
|
mean value: 0.13298433838044102
|
|
|
|
key: test_precision
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0.66666667 0. 0. 0.33333333]
|
|
|
|
mean value: 0.09999999999999999
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0.25 0. 0. 0.28571429]
|
|
|
|
mean value: 0.05357142857142857
|
|
|
|
key: train_recall
|
|
value: [0.07246377 0.07246377 0.10294118 0.05882353 0.10294118 0.08823529
|
|
0.04411765 0.07352941 0.07246377 0.02898551]
|
|
|
|
mean value: 0.07169650468883206
|
|
|
|
key: test_roc_auc
|
|
value: [0.48275862 0.46551724 0.44642857 0.39285714 0.46428571 0.33928571
|
|
0.60714286 0.46428571 0.44642857 0.57142857]
|
|
|
|
mean value: 0.4680418719211823
|
|
|
|
key: train_roc_auc
|
|
value: [0.53623188 0.53623188 0.55147059 0.52941176 0.55147059 0.54411765
|
|
0.52205882 0.53676471 0.53623188 0.51449275]
|
|
|
|
mean value: 0.535848252344416
|
|
|
|
key: test_jcc
|
|
value: [0. 0. 0. 0. 0. 0.
|
|
0.22222222 0. 0. 0.18181818]
|
|
|
|
mean value: 0.0404040404040404
|
|
|
|
key: train_jcc
|
|
value: [0.07246377 0.07246377 0.10294118 0.05882353 0.10294118 0.08823529
|
|
0.04411765 0.07352941 0.07246377 0.02898551]
|
|
|
|
mean value: 0.07169650468883206
|
|
|
|
MCC on Blind test: 0.02
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02734208 0.03658342 0.03627729 0.03802681 0.03607988 0.03623199
|
|
0.07836962 0.0335741 0.04661655 0.03537488]
|
|
|
|
mean value: 0.040447664260864255
|
|
|
|
key: score_time
|
|
value: [0.02119327 0.02037835 0.02254105 0.02403259 0.02666831 0.02385139
|
|
0.02655315 0.0224328 0.02348709 0.02180433]
|
|
|
|
mean value: 0.023294234275817872
|
|
|
|
key: test_mcc
|
|
value: [1. 0.6144869 0.55814043 0.75134288 0.46291005 0.66143783
|
|
0.67857143 0.77151675 0.71842121 0.7484552 ]
|
|
|
|
mean value: 0.6965282676681155
|
|
|
|
key: train_mcc
|
|
value: [0.89727565 0.88757529 0.87718604 0.86725712 0.88588911 0.88708251
|
|
0.90539133 0.91509932 0.88839586 0.89735962]
|
|
|
|
mean value: 0.89085118703998
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.88888889 0.86111111 0.91666667 0.80555556 0.88888889
|
|
0.88888889 0.91666667 0.91428571 0.91428571]
|
|
|
|
mean value: 0.8995238095238095
|
|
|
|
key: train_accuracy
|
|
value: [0.96583851 0.96273292 0.95962733 0.95652174 0.96273292 0.96273292
|
|
0.9689441 0.97204969 0.9628483 0.96594427]
|
|
|
|
mean value: 0.9639972693883045
|
|
|
|
key: test_fscore
|
|
value: [1. 0.66666667 0.61538462 0.76923077 0.58823529 0.66666667
|
|
0.75 0.82352941 0.72727273 0.8 ]
|
|
|
|
mean value: 0.7406986151103798
|
|
|
|
key: train_fscore
|
|
value: [0.91851852 0.91044776 0.90225564 0.89393939 0.90769231 0.91044776
|
|
0.92424242 0.93233083 0.91176471 0.91851852]
|
|
|
|
mean value: 0.9130157857346989
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.8 1. 0.55555556 1.
|
|
0.75 0.77777778 1. 0.75 ]
|
|
|
|
mean value: 0.8433333333333334
|
|
|
|
key: train_precision
|
|
value: [0.93939394 0.93846154 0.92307692 0.921875 0.9516129 0.92424242
|
|
0.953125 0.95384615 0.92537313 0.93939394]
|
|
|
|
mean value: 0.9370400955969084
|
|
|
|
key: test_recall
|
|
value: [1. 0.57142857 0.5 0.625 0.625 0.5
|
|
0.75 0.875 0.57142857 0.85714286]
|
|
|
|
mean value: 0.6875
|
|
|
|
key: train_recall
|
|
value: [0.89855072 0.88405797 0.88235294 0.86764706 0.86764706 0.89705882
|
|
0.89705882 0.91176471 0.89855072 0.89855072]
|
|
|
|
mean value: 0.8903239556692242
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.76847291 0.73214286 0.8125 0.74107143 0.75
|
|
0.83928571 0.90178571 0.78571429 0.89285714]
|
|
|
|
mean value: 0.8223830049261084
|
|
|
|
key: train_roc_auc
|
|
value: [0.94137022 0.93412385 0.93133395 0.92398101 0.92791802 0.93868689
|
|
0.9426239 0.94997684 0.93943284 0.94140135]
|
|
|
|
mean value: 0.9370848871745019
|
|
|
|
key: test_jcc
|
|
value: [1. 0.5 0.44444444 0.625 0.41666667 0.5
|
|
0.6 0.7 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6024206349206349
|
|
|
|
key: train_jcc
|
|
value: [0.84931507 0.83561644 0.82191781 0.80821918 0.83098592 0.83561644
|
|
0.85915493 0.87323944 0.83783784 0.84931507]
|
|
|
|
mean value: 0.8401218119527979
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.24777889 0.24893379 0.25687099 0.27457643 0.29407358 0.27596569
|
|
0.33060312 0.30342221 0.28261089 0.29954863]
|
|
|
|
mean value: 0.281438422203064
|
|
|
|
key: score_time
|
|
value: [0.025563 0.02152777 0.02327323 0.02313852 0.02621269 0.02568793
|
|
0.02634549 0.02385783 0.03024006 0.02442074]
|
|
|
|
mean value: 0.025026726722717284
|
|
|
|
key: test_mcc
|
|
value: [1. 0.6144869 0.55814043 0.75032247 0.46291005 0.66143783
|
|
0.67857143 0.77151675 0.71842121 0.7484552 ]
|
|
|
|
mean value: 0.6964262266368373
|
|
|
|
key: train_mcc
|
|
value: [0.9358192 0.88757529 0.87718604 0.93513953 0.88588911 0.88708251
|
|
0.94407133 0.94407133 0.88839586 0.89735962]
|
|
|
|
mean value: 0.9082589838760209
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.88888889 0.86111111 0.91666667 0.80555556 0.88888889
|
|
0.88888889 0.91666667 0.91428571 0.91428571]
|
|
|
|
mean value: 0.8995238095238095
|
|
|
|
key: train_accuracy
|
|
value: [0.97826087 0.96273292 0.95962733 0.97826087 0.96273292 0.96273292
|
|
0.98136646 0.98136646 0.9628483 0.96594427]
|
|
|
|
mean value: 0.9695873315001058
|
|
|
|
key: test_fscore
|
|
value: [1. 0.66666667 0.61538462 0.8 0.58823529 0.66666667
|
|
0.75 0.82352941 0.72727273 0.8 ]
|
|
|
|
mean value: 0.7437755381873029
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:107: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:110: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
key: train_fscore
|
|
value: [0.94964029 0.91044776 0.90225564 0.94890511 0.90769231 0.91044776
|
|
0.95588235 0.95588235 0.91176471 0.91851852]
|
|
|
|
mean value: 0.9271436796720172
|
|
|
|
key: test_precision
|
|
value: [1. 0.8 0.8 0.85714286 0.55555556 1.
|
|
0.75 0.77777778 1. 0.75 ]
|
|
|
|
mean value: 0.829047619047619
|
|
|
|
key: train_precision
|
|
value: [0.94285714 0.93846154 0.92307692 0.94202899 0.9516129 0.92424242
|
|
0.95588235 0.95588235 0.92537313 0.93939394]
|
|
|
|
mean value: 0.9398811696975732
|
|
|
|
key: test_recall
|
|
value: [1. 0.57142857 0.5 0.75 0.625 0.5
|
|
0.75 0.875 0.57142857 0.85714286]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [0.95652174 0.88405797 0.88235294 0.95588235 0.86764706 0.89705882
|
|
0.95588235 0.95588235 0.89855072 0.89855072]
|
|
|
|
mean value: 0.9152387041773231
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.76847291 0.73214286 0.85714286 0.74107143 0.75
|
|
0.83928571 0.90178571 0.78571429 0.89285714]
|
|
|
|
mean value: 0.8268472906403941
|
|
|
|
key: train_roc_auc
|
|
value: [0.97035573 0.93412385 0.93133395 0.97006716 0.92791802 0.93868689
|
|
0.97203566 0.97203566 0.93943284 0.94140135]
|
|
|
|
mean value: 0.9497391118222522
|
|
|
|
key: test_jcc
|
|
value: [1. 0.5 0.44444444 0.66666667 0.41666667 0.5
|
|
0.6 0.7 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6065873015873016
|
|
|
|
key: train_jcc
|
|
value: [0.90410959 0.83561644 0.82191781 0.90277778 0.83098592 0.83561644
|
|
0.91549296 0.91549296 0.83783784 0.84931507]
|
|
|
|
mean value: 0.8649162789067284
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03773212 0.03759193 0.03757524 0.03828263 0.0400517 0.03780508
|
|
0.03701949 0.04745054 0.04011869 0.03937149]
|
|
|
|
mean value: 0.03929989337921143
|
|
|
|
key: score_time
|
|
value: [0.01247263 0.01345062 0.01343799 0.01589584 0.01312828 0.01231027
|
|
0.01372886 0.01359797 0.01646304 0.01724863]
|
|
|
|
mean value: 0.014173412322998047
|
|
|
|
key: test_mcc
|
|
value: [0.82942474 0.75492611 0.75492611 0.92980296 0.75434227 0.89802651
|
|
0.75434227 0.82618439 0.85933785 0.93094934]
|
|
|
|
mean value: 0.8292262532717305
|
|
|
|
key: train_mcc
|
|
value: [0.9094503 0.921366 0.90138807 0.89754406 0.9212884 0.92916266
|
|
0.91738682 0.90945587 0.91732994 0.90562412]
|
|
|
|
mean value: 0.9129996244909306
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.87719298 0.87719298 0.96491228 0.875 0.94642857
|
|
0.875 0.91071429 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9131578947368421
|
|
|
|
key: train_accuracy
|
|
value: [0.95463511 0.96055227 0.95069034 0.94871795 0.96062992 0.96456693
|
|
0.95866142 0.95472441 0.95866142 0.95275591]
|
|
|
|
mean value: 0.9564595660749506
|
|
|
|
key: test_fscore
|
|
value: [0.91525424 0.87719298 0.87719298 0.96551724 0.88135593 0.94339623
|
|
0.88135593 0.91525424 0.92592593 0.96551724]
|
|
|
|
mean value: 0.9147962938994972
|
|
|
|
key: train_fscore
|
|
value: [0.95427435 0.96015936 0.95069034 0.94820717 0.96047431 0.96442688
|
|
0.95841584 0.95463511 0.95857988 0.95238095]
|
|
|
|
mean value: 0.956224419292093
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.86206897 0.89285714 0.96551724 0.83870968 1.
|
|
0.83870968 0.87096774 0.96153846 0.93333333]
|
|
|
|
mean value: 0.9034669983335167
|
|
|
|
key: train_precision
|
|
value: [0.96385542 0.97177419 0.9488189 0.95582329 0.96428571 0.96825397
|
|
0.96414343 0.95652174 0.96047431 0.96 ]
|
|
|
|
mean value: 0.9613950962310953
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.89285714 0.86206897 0.96551724 0.92857143 0.89285714
|
|
0.92857143 0.96428571 0.89285714 1. ]
|
|
|
|
mean value: 0.9291871921182266
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.9488189 0.95256917 0.94071146 0.95669291 0.96062992
|
|
0.95275591 0.95275591 0.95669291 0.94488189]
|
|
|
|
mean value: 0.951139086863154
|
|
|
|
key: test_roc_auc
|
|
value: [0.91317734 0.87746305 0.87746305 0.96490148 0.875 0.94642857
|
|
0.875 0.91071429 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9133004926108375
|
|
|
|
key: train_roc_auc
|
|
value: [0.95465438 0.96057546 0.95069403 0.94870219 0.96062992 0.96456693
|
|
0.95866142 0.95472441 0.95866142 0.95275591]
|
|
|
|
mean value: 0.9564626062058448
|
|
|
|
key: test_jcc
|
|
value: [0.84375 0.78125 0.78125 0.93333333 0.78787879 0.89285714
|
|
0.78787879 0.84375 0.86206897 0.93333333]
|
|
|
|
mean value: 0.8447350350798627
|
|
|
|
key: train_jcc
|
|
value: [0.91254753 0.92337165 0.90601504 0.90151515 0.92395437 0.93129771
|
|
0.92015209 0.91320755 0.92045455 0.90909091]
|
|
|
|
mean value: 0.9161606540653082
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.94869375 0.87011933 1.19105005 0.99319768 0.97389174 0.92805409
0.93025017 0.93565726 0.83603239 1.05092835]

mean value: 0.9657874822616577

key: score_time
value: [0.01407504 0.01457548 0.01895332 0.02009797 0.01391673 0.01355195
0.01422977 0.01349163 0.01395845 0.01340699]

mean value: 0.015025734901428223

key: test_mcc
value: [0.82942474 0.8951918 0.86189955 0.96547546 0.85933785 0.93094934
0.93094934 0.93094934 0.89342711 0.93094934]

mean value: 0.9028553849533496

key: train_mcc
value: [0.98817342 0.98028353 0.98817323 0.98817323 0.99607071 0.99212598
0.98819663 1. 0.99212598 0.98032256]

mean value: 0.9893645282093079

key: test_accuracy
value: [0.9122807 0.94736842 0.92982456 0.98245614 0.92857143 0.96428571
0.96428571 0.96428571 0.94642857 0.96428571]

mean value: 0.9504072681704261

key: train_accuracy
value: [0.99408284 0.99013807 0.99408284 0.99408284 0.9980315 0.99606299
0.99409449 1. 0.99606299 0.99015748]

mean value: 0.9946796036590101

key: test_fscore
value: [0.91525424 0.94545455 0.92857143 0.98305085 0.93103448 0.96296296
0.96551724 0.96551724 0.94736842 0.96551724]

mean value: 0.9510248649683883

key: train_fscore
value: [0.99408284 0.99017682 0.99405941 0.99405941 0.99802761 0.99606299
0.99408284 1. 0.99606299 0.99017682]

mean value: 0.9946791724596361

key: test_precision
value: [0.87096774 0.96296296 0.96296296 0.96666667 0.9 1.
0.93333333 0.93333333 0.93103448 0.93333333]

mean value: 0.9394594817286697

key: train_precision
value: [0.99604743 0.98823529 0.99603175 0.99603175 1. 0.99606299
0.99604743 1. 0.99606299 0.98823529]

mean value: 0.9952754926210834

key: test_recall
value: [0.96428571 0.92857143 0.89655172 1. 0.96428571 0.92857143
1. 1. 0.96428571 1. ]

mean value: 0.9646551724137931

key: train_recall
value: [0.99212598 0.99212598 0.99209486 0.99209486 0.99606299 0.99606299
0.99212598 1. 0.99606299 0.99212598]

mean value: 0.9940882636705985

key: test_roc_auc
value: [0.91317734 0.94704433 0.93041872 0.98214286 0.92857143 0.96428571
0.96428571 0.96428571 0.94642857 0.96428571]

mean value: 0.9504926108374385

key: train_roc_auc
value: [0.99408671 0.99013414 0.99407893 0.99407893 0.9980315 0.99606299
0.99409449 1. 0.99606299 0.99015748]

mean value: 0.9946788148517008

key: test_jcc
value: [0.84375 0.89655172 0.86666667 0.96666667 0.87096774 0.92857143
0.93333333 0.93333333 0.9 0.93333333]

mean value: 0.9073174227978177

key: train_jcc
value: [0.98823529 0.98054475 0.98818898 0.98818898 0.99606299 0.99215686
0.98823529 1. 0.99215686 0.98054475]

mean value: 0.9894314752770804

MCC on Blind test: 0.81

Accuracy on Blind test: 0.93

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01455855 0.01241016 0.01061916 0.01032066 0.01021433 0.00997758
0.01221704 0.01006413 0.01007271 0.01009989]

mean value: 0.011055421829223634

key: score_time
value: [0.01239634 0.00947404 0.00947428 0.00910234 0.00912118 0.00902677
0.00919771 0.00907707 0.00964546 0.00893784]

mean value: 0.009545302391052246

key: test_mcc
value: [0.47713554 0.553659 0.75462449 0.54592083 0.60753044 0.64285714
0.5118907 0.67082039 0.39310793 0.5118907 ]

mean value: 0.5669437167779643

key: train_mcc
value: [0.61830137 0.59162207 0.59295071 0.61416745 0.67031032 0.64585416
0.65661014 0.65225378 0.62955117 0.6032316 ]

mean value: 0.6274852770592964

key: test_accuracy
value: [0.73684211 0.77192982 0.87719298 0.77192982 0.80357143 0.82142857
0.75 0.82142857 0.69642857 0.75 ]

mean value: 0.7800751879699248

key: train_accuracy
value: [0.80473373 0.78303748 0.79487179 0.80473373 0.83464567 0.81889764
0.82480315 0.82283465 0.81299213 0.7992126 ]

mean value: 0.8100762552609918

key: test_fscore
value: [0.74576271 0.78688525 0.88135593 0.78688525 0.8 0.82142857
0.77419355 0.84375 0.70175439 0.77419355]

mean value: 0.7916209190038752

key: train_fscore
value: [0.82032668 0.81099656 0.80451128 0.81564246 0.82995951 0.83211679
0.83669725 0.83455882 0.82242991 0.81111111]

mean value: 0.821835037001602

key: test_precision
value: [0.70967742 0.72727273 0.86666667 0.75 0.81481481 0.82142857
0.70588235 0.75 0.68965517 0.70588235]

mean value: 0.7541280077833765

key: train_precision
value: [0.76094276 0.7195122 0.76702509 0.77112676 0.85416667 0.7755102
0.78350515 0.78275862 0.78291815 0.76573427]

mean value: 0.7763199867511414

key: test_recall
value: [0.78571429 0.85714286 0.89655172 0.82758621 0.78571429 0.82142857
0.85714286 0.96428571 0.71428571 0.85714286]

mean value: 0.8366995073891625

key: train_recall
value: [0.88976378 0.92913386 0.8458498 0.86561265 0.80708661 0.8976378
0.8976378 0.89370079 0.86614173 0.86220472]

mean value: 0.8754769537207059

key: test_roc_auc
value: [0.73768473 0.77339901 0.87684729 0.77093596 0.80357143 0.82142857
0.75 0.82142857 0.69642857 0.75 ]

mean value: 0.7801724137931034

key: train_roc_auc
value: [0.80456568 0.78274875 0.79497215 0.80485357 0.83464567 0.81889764
0.82480315 0.82283465 0.81299213 0.7992126 ]

mean value: 0.8100525971802932

key: test_jcc
value: [0.59459459 0.64864865 0.78787879 0.64864865 0.66666667 0.6969697
0.63157895 0.72972973 0.54054054 0.63157895]

mean value: 0.6576835208414156

key: train_jcc
value: [0.69538462 0.68208092 0.67295597 0.68867925 0.70934256 0.7125
0.7192429 0.71608833 0.6984127 0.68224299]

mean value: 0.6976930240270341

MCC on Blind test: 0.2

Accuracy on Blind test: 0.67

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01165748 0.01036167 0.0107286 0.0104363 0.01048374 0.01035953
0.0104208 0.01058149 0.01036954 0.01047564]

mean value: 0.010587477684020996

key: score_time
value: [0.00932479 0.00897956 0.00896621 0.00899768 0.00909901 0.00911689
0.00907898 0.00910473 0.00895834 0.00902915]

mean value: 0.009065532684326172

key: test_mcc
value: [0.58562417 0.62473685 0.50927421 0.57973205 0.60753044 0.71428571
0.64951905 0.72168784 0.67900461 0.39310793]

mean value: 0.6064502839116733

key: train_mcc
value: [0.63864108 0.67343572 0.67495523 0.65362362 0.63188315 0.6387663
0.65228602 0.64665231 0.67097829 0.6472967 ]

mean value: 0.6528518419173023

key: test_accuracy
value: [0.78947368 0.80701754 0.75438596 0.78947368 0.80357143 0.85714286
0.82142857 0.85714286 0.83928571 0.69642857]

mean value: 0.8015350877192983

key: train_accuracy
value: [0.81854043 0.83629191 0.83629191 0.82642998 0.81496063 0.81889764
0.82480315 0.82283465 0.83464567 0.82283465]

mean value: 0.8256530618583919

key: test_fscore
value: [0.8 0.81967213 0.76666667 0.8 0.80701754 0.85714286
0.83333333 0.86666667 0.83636364 0.69090909]

mean value: 0.8077771926089441

key: train_fscore
value: [0.82509506 0.84069098 0.84250474 0.83011583 0.8219697 0.82375479
0.83239171 0.82758621 0.84030418 0.82889734]

mean value: 0.8313310537668297

key: test_precision
value: [0.75 0.75757576 0.74193548 0.77419355 0.79310345 0.85714286
0.78125 0.8125 0.85185185 0.7037037 ]

mean value: 0.7823256650808097

key: train_precision
value: [0.79779412 0.82022472 0.81021898 0.81132075 0.7919708 0.80223881
0.79783394 0.80597015 0.8125 0.80147059]

mean value: 0.8051542850964287

key: test_recall
value: [0.85714286 0.89285714 0.79310345 0.82758621 0.82142857 0.85714286
0.89285714 0.92857143 0.82142857 0.67857143]

mean value: 0.8370689655172414

key: train_recall
value: [0.85433071 0.86220472 0.87747036 0.84980237 0.85433071 0.84645669
0.87007874 0.8503937 0.87007874 0.85826772]

mean value: 0.8593414459556192

key: test_roc_auc
value: [0.79064039 0.80849754 0.75369458 0.7887931 0.80357143 0.85714286
0.82142857 0.85714286 0.83928571 0.69642857]

mean value: 0.8016625615763546

key: train_roc_auc
value: [0.8184697 0.8362407 0.83637297 0.82647599 0.81496063 0.81889764
0.82480315 0.82283465 0.83464567 0.82283465]

mean value: 0.8256535744296786

key: test_jcc
value: [0.66666667 0.69444444 0.62162162 0.66666667 0.67647059 0.75
0.71428571 0.76470588 0.71875 0.52777778]

mean value: 0.6801389362051127

key: train_jcc
value: [0.70226537 0.72516556 0.72786885 0.70957096 0.6977492 0.70032573
0.71290323 0.70588235 0.72459016 0.70779221]

mean value: 0.7114113624151682

MCC on Blind test: 0.31

Accuracy on Blind test: 0.73

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00955963 0.01076388 0.0108633 0.01068926 0.01065207 0.01071215
0.00975442 0.01087141 0.01065636 0.01067686]

mean value: 0.010519933700561524

key: score_time
value: [0.01791644 0.01392055 0.01368237 0.01361632 0.01391411 0.01670647
0.01601553 0.01230955 0.01352668 0.01577139]

mean value: 0.014737939834594727

key: test_mcc
value: [0.62473685 0.82490815 0.47519927 0.65018988 0.72168784 0.72168784
0.65814518 0.8660254 0.71611487 0.68250015]

mean value: 0.6941195430259357

key: train_mcc
value: [0.82324487 0.81065015 0.84223222 0.79510329 0.80709287 0.80337378
0.79936749 0.79163927 0.81142619 0.83148876]

mean value: 0.8115618890001493

key: test_accuracy
value: [0.80701754 0.9122807 0.73684211 0.8245614 0.85714286 0.85714286
0.82142857 0.92857143 0.85714286 0.83928571]

mean value: 0.844141604010025

key: train_accuracy
value: [0.9112426 0.90532544 0.92110454 0.8974359 0.90354331 0.9015748
0.8996063 0.89566929 0.90551181 0.91535433]

mean value: 0.9056368323782013

key: test_fscore
value: [0.81967213 0.90909091 0.75409836 0.83333333 0.86666667 0.86666667
0.83870968 0.93333333 0.85185185 0.83018868]

mean value: 0.8503611609410677

key: train_fscore
value: [0.9132948 0.90551181 0.92063492 0.8984375 0.90335306 0.90272374
0.9005848 0.89708738 0.90697674 0.91714836]

mean value: 0.9065753102337704

key: test_precision
value: [0.75757576 0.92592593 0.71875 0.80645161 0.8125 0.8125
0.76470588 0.875 0.88461538 0.88 ]

mean value: 0.8238024563373235

key: train_precision
value: [0.89433962 0.90551181 0.92430279 0.88803089 0.90513834 0.89230769
0.89189189 0.88505747 0.89312977 0.89811321]

mean value: 0.8977823484465078

key: test_recall
value: [0.89285714 0.89285714 0.79310345 0.86206897 0.92857143 0.92857143
0.92857143 1. 0.82142857 0.78571429]

mean value: 0.8833743842364532

key: train_recall
value: [0.93307087 0.90551181 0.91699605 0.90909091 0.9015748 0.91338583
0.90944882 0.90944882 0.92125984 0.93700787]

mean value: 0.9156795617939062

key: test_roc_auc
value: [0.80849754 0.91194581 0.73583744 0.82389163 0.85714286 0.85714286
0.82142857 0.92857143 0.85714286 0.83928571]

mean value: 0.844088669950739

key: train_roc_auc
value: [0.91119946 0.90532508 0.92109645 0.89745884 0.90354331 0.9015748
0.8996063 0.89566929 0.90551181 0.91535433]

mean value: 0.9056339671967881

key: test_jcc
value: [0.69444444 0.83333333 0.60526316 0.71428571 0.76470588 0.76470588
0.72222222 0.875 0.74193548 0.70967742]

mean value: 0.742557354011214

key: train_jcc
value: [0.84042553 0.82733813 0.85294118 0.81560284 0.82374101 0.82269504
0.81914894 0.81338028 0.82978723 0.84697509]

mean value: 0.8292035258287433

MCC on Blind test: 0.4

Accuracy on Blind test: 0.8

Model_name: SVM
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.021698 0.0257144 0.02186465 0.02199745 0.02193356 0.02225208
0.02210999 0.02229166 0.02531195 0.02225661]

mean value: 0.022743034362792968

key: score_time
value: [0.01183033 0.01237917 0.01338649 0.01215816 0.01205087 0.01202941
0.01245189 0.01203823 0.01343703 0.01191998]

mean value: 0.012368154525756837

key: test_mcc
value: [0.79778885 0.68472906 0.68736396 0.9321832 0.75047877 0.82195294
0.73127242 0.73127242 0.75434227 0.85933785]

mean value: 0.7750721760268345

key: train_mcc
value: [0.83878121 0.84285233 0.85486038 0.8349816 0.8355787 0.84662074
0.83123063 0.84662074 0.83630655 0.8543903 ]

mean value: 0.8422223192370651

key: test_accuracy
value: [0.89473684 0.84210526 0.84210526 0.96491228 0.875 0.91071429
0.85714286 0.85714286 0.875 0.92857143]

mean value: 0.8847431077694236

key: train_accuracy
value: [0.91913215 0.92110454 0.9270217 0.91715976 0.91732283 0.92322835
0.91535433 0.92322835 0.91732283 0.92716535]

mean value: 0.9208040193200702

key: test_fscore
value: [0.9 0.84210526 0.85245902 0.96428571 0.87719298 0.90909091
0.87096774 0.87096774 0.86792453 0.93103448]

mean value: 0.8886028380315576

key: train_fscore
value: [0.92069632 0.92277992 0.92843327 0.91860465 0.91923077 0.92397661
0.91682785 0.92397661 0.91984733 0.92759295]

mean value: 0.9221966289590753

key: test_precision
value: [0.84375 0.82758621 0.8125 1. 0.86206897 0.92592593
0.79411765 0.79411765 0.92 0.9 ]

mean value: 0.8680066392457366

key: train_precision
value: [0.90494297 0.90530303 0.90909091 0.90114068 0.89849624 0.91505792
0.90114068 0.91505792 0.89259259 0.92217899]

mean value: 0.9065001925631475

key: test_recall
value: [0.96428571 0.85714286 0.89655172 0.93103448 0.89285714 0.89285714
0.96428571 0.96428571 0.82142857 0.96428571]

mean value: 0.9149014778325123

key: train_recall
value: [0.93700787 0.94094488 0.9486166 0.93675889 0.94094488 0.93307087
0.93307087 0.93307087 0.9488189 0.93307087]

mean value: 0.9385375494071146

key: test_roc_auc
value: [0.89593596 0.84236453 0.841133 0.96551724 0.875 0.91071429
0.85714286 0.85714286 0.875 0.92857143]

mean value: 0.8848522167487685

key: train_roc_auc
value: [0.91909682 0.92106533 0.92706421 0.91719834 0.91732283 0.92322835
0.91535433 0.92322835 0.91732283 0.92716535]

mean value: 0.9208046746133018

key: test_jcc
value: [0.81818182 0.72727273 0.74285714 0.93103448 0.78125 0.83333333
0.77142857 0.77142857 0.76666667 0.87096774]

mean value: 0.8014421055862936

key: train_jcc
value: [0.85304659 0.85663082 0.86642599 0.84946237 0.85053381 0.85869565
0.84642857 0.85869565 0.85159011 0.8649635 ]

mean value: 0.8556473070988301

MCC on Blind test: 0.68

Accuracy on Blind test: 0.89

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [2.07555914 2.03910518 1.99554539 1.15945792 2.01181102 2.08725309
2.18683028 2.63542819 2.05218887 2.24368691]

mean value: 2.0486865997314454

key: score_time
value: [0.01258802 0.0125947 0.01385498 0.01252556 0.02199531 0.01386309
0.0126183 0.05426383 0.01399326 0.01492596]

mean value: 0.018322300910949708

key: test_mcc
value: [0.76689254 0.82490815 0.85960591 1. 0.78772636 0.89342711
0.8660254 0.89802651 0.8660254 0.93094934]

mean value: 0.8693586722821179

key: train_mcc
value: [0.99606293 0.99606293 0.99606299 0.98425123 1. 0.99607071
1. 1. 0.99607071 0.99607071]

mean value: 0.9960652223608333

key: test_accuracy
value: [0.87719298 0.9122807 0.92982456 1. 0.89285714 0.94642857
0.92857143 0.94642857 0.92857143 0.96428571]

mean value: 0.9326441102756893

key: train_accuracy
value: [0.99802761 0.99802761 0.99802761 0.99211045 1. 0.9980315
1. 1. 0.9980315 0.9980315 ]

mean value: 0.9980287782074578

key: test_fscore
value: [0.8852459 0.90909091 0.93103448 1. 0.89655172 0.94545455
0.93333333 0.94915254 0.92307692 0.96551724]

mean value: 0.9338457603243798

key: train_fscore
value: [0.99803536 0.99803536 0.99802761 0.99206349 1. 0.99803536
1. 1. 0.99803536 0.99803536]

mean value: 0.9980267922764523

key: test_precision
value: [0.81818182 0.92592593 0.93103448 1. 0.86666667 0.96296296
0.875 0.90322581 1. 0.93333333]

mean value: 0.921633099628094

key: train_precision
value: [0.99607843 0.99607843 0.99606299 0.99601594 1. 0.99607843
1. 1. 0.99607843 0.99607843]

mean value: 0.9972471085243709

key: test_recall
value: [0.96428571 0.89285714 0.93103448 1. 0.92857143 0.92857143
1. 1. 0.85714286 1. ]

mean value: 0.9502463054187192

key: train_recall
value: [1. 1. 1. 0.98814229 1. 1.
1. 1. 1. 1. ]

mean value: 0.9988142292490119

key: test_roc_auc
value: [0.87869458 0.91194581 0.92980296 1. 0.89285714 0.94642857
0.92857143 0.94642857 0.92857143 0.96428571]

mean value: 0.9327586206896552

key: train_roc_auc
value: [0.99802372 0.99802372 0.9980315 0.99210264 1. 0.9980315
1. 1. 0.9980315 0.9980315 ]

mean value: 0.998027605739006

key: test_jcc
value: [0.79411765 0.83333333 0.87096774 1. 0.8125 0.89655172
0.875 0.90322581 0.85714286 0.93333333]

mean value: 0.8776172443393375

key: train_jcc
value: [0.99607843 0.99607843 0.99606299 0.98425197 1. 0.99607843
1. 1. 0.99607843 0.99607843]

mean value: 0.9960707117492666

MCC on Blind test: 0.77

Accuracy on Blind test: 0.92

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06019211 0.02817774 0.03345776 0.03131986 0.03362656 0.03552794
|
|
0.03400302 0.03139901 0.03243303 0.03354049]
|
|
|
|
mean value: 0.035367751121521
|
|
|
|
key: score_time
|
|
value: [0.01245236 0.00913262 0.00972891 0.00909996 0.01145601 0.01146054
|
|
0.00974631 0.01045251 0.00957561 0.00912237]
|
|
|
|
mean value: 0.010222721099853515
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.79110556 0.85960591 0.92980296 0.70082556 1.
|
|
0.85933785 0.89802651 0.93094934 0.93094934]
|
|
|
|
mean value: 0.8832624260804833
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.89473684 0.92982456 0.96491228 0.83928571 1.
|
|
0.92857143 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9397243107769424
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.88888889 0.93103448 0.96551724 0.85714286 1.
|
|
0.93103448 0.94915254 0.96551724 0.96296296]
|
|
|
|
mean value: 0.9414213662606415
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.92307692 0.93103448 0.96551724 0.77142857 1.
|
|
0.9 0.90322581 0.93333333 1. ]
|
|
|
|
mean value: 0.9327616358428372
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.85714286 0.93103448 0.96551724 0.96428571 1.
|
|
0.96428571 1. 1. 0.92857143]
|
|
|
|
mean value: 0.9539408866995074
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.89408867 0.92980296 0.96490148 0.83928571 1.
|
|
0.92857143 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9395935960591133
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.8 0.87096774 0.93333333 0.75 1.
|
|
0.87096774 0.90322581 0.93333333 0.92857143]
|
|
|
|
mean value: 0.8918970814132104
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
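The per-fold Decision Tree scores above can be cross-checked by hand. The following is a minimal sketch, assuming the first fold held 57 test samples with TP=26, FP=0, FN=2, TN=29. These counts are not printed in the log; they are inferred from that fold's accuracy (0.96491228), precision (1.0) and recall (0.92857143). The helper name mcc_from_counts is hypothetical.

```python
import math


def mcc_from_counts(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Inferred counts for fold 1 (assumption; not printed in the log).
tp, fp, fn, tn = 26, 0, 2, 29
accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)
mcc = mcc_from_counts(tp, fp, fn, tn)
print(f"accuracy={accuracy:.8f} recall={recall:.8f} mcc={mcc:.8f}")
```

Under these assumed counts the three values agree with the first entries of test_accuracy, test_recall and test_mcc logged above.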
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12719536 0.13101172 0.12918282 0.12430263 0.12717915 0.12785411
|
|
0.12964702 0.13043404 0.13283753 0.13035512]
|
|
|
|
mean value: 0.12899994850158691
|
|
|
|
key: score_time
|
|
value: [0.01965737 0.01963568 0.01971865 0.0194478 0.01989031 0.01939631
|
|
0.01996708 0.02017188 0.0194571 0.01910806]
|
|
|
|
mean value: 0.019645023345947265
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.85960591 0.71921182 0.96551724 0.89342711 0.85714286
|
|
0.8660254 0.93094934 0.83484711 0.92857143]
|
|
|
|
mean value: 0.8785101179086021
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.92982456 0.85964912 0.98245614 0.94642857 0.92857143
|
|
0.92857143 0.96428571 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9379699248120301
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.92857143 0.86206897 0.98245614 0.94736842 0.92857143
|
|
0.93333333 0.96551724 0.90196078 0.96428571]
|
|
|
|
mean value: 0.9378419171661405
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.92857143 0.86206897 1. 0.93103448 0.92857143
|
|
0.875 0.93333333 1. 0.96428571]
|
|
|
|
mean value: 0.9387151067323481
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.86206897 0.96551724 0.96428571 0.92857143
|
|
1. 1. 0.82142857 0.96428571]
|
|
|
|
mean value: 0.9399014778325123
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.92980296 0.85960591 0.98275862 0.94642857 0.92857143
|
|
0.92857143 0.96428571 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9379926108374385
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.86666667 0.75757576 0.96551724 0.9 0.86666667
|
|
0.875 0.93333333 0.82142857 0.93103448]
|
|
|
|
mean value: 0.8848257202567548
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.9
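The test_jcc values in this block appear to follow directly from test_fscore. For a binary positive class, F1 = 2TP / (2TP + FP + FN) and the Jaccard index is J = TP / (TP + FP + FN), so J = F1 / (2 - F1). A minimal sketch of that conversion, assuming test_jcc is the positive-class Jaccard score (the log does not say how it was computed); the function name jaccard_from_f1 is hypothetical.

```python
def jaccard_from_f1(f1):
    """Convert a binary F1 score to the Jaccard index: J = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# First three Extra Trees test_fscore values copied from the log above.
test_fscore = [0.96428571, 0.92857143, 0.86206897]
# Corresponding test_jcc values copied from the log above.
test_jcc = [0.93103448, 0.86666667, 0.75757576]

for f1, logged in zip(test_fscore, test_jcc):
    derived = jaccard_from_f1(f1)
    print(f"F1={f1:.8f} -> J={derived:.8f} (logged {logged:.8f})")
```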
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01180434 0.01166081 0.01208377 0.01183748 0.011832 0.01188374
|
|
0.01314521 0.0108161 0.01178122 0.01097274]
|
|
|
|
mean value: 0.011781740188598632
|
|
|
|
key: score_time
|
|
value: [0.00960636 0.00973678 0.0089736 0.00985813 0.00992084 0.00999618
|
|
0.00909567 0.00982451 0.0099051 0.00940228]
|
|
|
|
mean value: 0.009631943702697755
|
|
|
|
key: test_mcc
|
|
value: [0.50927421 0.54377353 0.59060008 0.7257422 0.5728919 0.42857143
|
|
0.75434227 0.64450339 0.39513166 0.57142857]
|
|
|
|
mean value: 0.5736259244504346
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.77192982 0.78947368 0.85964912 0.78571429 0.71428571
|
|
0.875 0.82142857 0.69642857 0.78571429]
|
|
|
|
mean value: 0.7854010025062657
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.74074074 0.76363636 0.8125 0.87096774 0.77777778 0.71428571
|
|
0.88135593 0.82758621 0.71186441 0.78571429]
|
|
|
|
mean value: 0.788642916996997
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.77777778 0.74285714 0.81818182 0.80769231 0.71428571
|
|
0.83870968 0.8 0.67741935 0.78571429]
|
|
|
|
mean value: 0.773186884799788
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.75 0.89655172 0.93103448 0.75 0.71428571
|
|
0.92857143 0.85714286 0.75 0.78571429]
|
|
|
|
mean value: 0.8077586206896552
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75369458 0.77155172 0.78756158 0.85837438 0.78571429 0.71428571
|
|
0.875 0.82142857 0.69642857 0.78571429]
|
|
|
|
mean value: 0.7849753694581281
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.58823529 0.61764706 0.68421053 0.77142857 0.63636364 0.55555556
|
|
0.78787879 0.70588235 0.55263158 0.64705882]
|
|
|
|
mean value: 0.6546892185901474
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.72
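The test_roc_auc values in this block track test_accuracy closely, which is what one would expect if the AUC was computed from hard 0/1 predictions rather than probabilities. In that case the ROC curve has a single interior point and its area equals (TPR + TNR) / 2, the balanced accuracy. A minimal sketch, assuming the first Extra Tree fold had TP=20, FP=6, FN=8, TN=23; these counts are inferred from that fold's accuracy, precision and recall and are not printed in the log. The function name auc_from_hard_labels is hypothetical.

```python
def auc_from_hard_labels(tp, fp, fn, tn):
    """ROC AUC when only 0/1 predictions are available.

    The curve passes through (0, 0), (FPR, TPR) and (1, 1); the area under
    those two segments is (TPR + TNR) / 2, i.e. the balanced accuracy.
    """
    tpr = tp / (tp + fn)  # recall / sensitivity
    tnr = tn / (tn + fp)  # specificity
    return (tpr + tnr) / 2.0


# Inferred counts for fold 1 (assumption; not printed in the log).
tp, fp, fn, tn = 20, 6, 8, 23
auc = auc_from_hard_labels(tp, fp, fn, tn)
print(f"roc_auc={auc:.8f}")
```

With these assumed counts the result agrees with the first logged test_roc_auc entry (0.75369458). Whether the script actually passed hard labels to the scorer is an inference from the numbers, not something the log states.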
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.09243822 1.99595952 2.03226089 1.98863244 2.03522563 2.05493855
|
|
2.07914615 1.96964455 2.04330802 2.02386498]
|
|
|
|
mean value: 2.031541895866394
|
|
|
|
key: score_time
|
|
value: [0.10404348 0.10880017 0.10026383 0.09849286 0.1004591 0.10068846
|
|
0.10119557 0.10098815 0.10136032 0.09345913]
|
|
|
|
mean value: 0.10097510814666748
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8951918 0.89988258 1. 0.89342711 0.96490128
|
|
0.93094934 0.93094934 0.92857143 1. ]
|
|
|
|
mean value: 0.9443872875319015
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94736842 0.94736842 1. 0.94642857 0.98214286
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9716165413533835
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94545455 0.94545455 1. 0.94736842 0.98181818
|
|
0.96551724 0.96551724 0.96428571 1. ]
|
|
|
|
mean value: 0.9715415890824239
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 1. 1. 0.93103448 1.
|
|
0.93333333 0.93333333 0.96428571 1. ]
|
|
|
|
mean value: 0.9724949826673964
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.89655172 1. 0.96428571 0.96428571
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9717980295566503
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.94704433 0.94827586 1. 0.94642857 0.98214286
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9716748768472907
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.89655172 0.89655172 1. 0.9 0.96428571
|
|
0.93333333 0.93333333 0.93103448 1. ]
|
|
|
|
mean value: 0.9455090311986863
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.93
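Each "mean value" line is the arithmetic mean of the ten fold scores printed just before it. The log reports no spread, so the sketch below also computes the sample standard deviation as an extra, using the Random Forest test_mcc values copied from above. The variable names are illustrative only.

```python
import statistics

# Random Forest test_mcc, one value per CV fold, copied from the log above.
test_mcc = [
    1.0, 0.8951918, 0.89988258, 1.0, 0.89342711,
    0.96490128, 0.93094934, 0.93094934, 0.92857143, 1.0,
]

mean_mcc = statistics.mean(test_mcc)
# Sample standard deviation across folds (not reported in the log).
sd_mcc = statistics.stdev(test_mcc)
print(f"mean={mean_mcc:.6f} sd={sd_mcc:.6f} n={len(test_mcc)}")
```

The mean matches the logged 0.9443872875319015 up to the rounding of the printed fold values.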
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.97834277 1.00785446 0.99240088 0.98267341 1.02293301 0.96102357
|
|
0.94727778 0.99126816 0.98970008 1.04643464]
|
|
|
|
mean value: 0.9919908761978149
|
|
|
|
key: score_time
|
|
value: [0.19152308 0.2427218 0.24201584 0.20301509 0.25864029 0.18198228
|
|
0.2727077 0.23134995 0.22634912 0.265769 ]
|
|
|
|
mean value: 0.23160741329193116
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8951918 0.9321832 1. 0.93094934 0.96490128
|
|
0.93094934 0.93094934 0.92857143 1. ]
|
|
|
|
mean value: 0.9513695720675288
|
|
|
|
key: train_mcc
|
|
value: [0.9685613 0.97645211 0.97645357 0.97245522 0.98032256 0.96862405
|
|
0.97250878 0.98032256 0.97250878 0.96862405]
|
|
|
|
mean value: 0.9736832978955833
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94736842 0.96491228 1. 0.96428571 0.98214286
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.97515664160401
|
|
|
|
key: train_accuracy
|
|
value: [0.98422091 0.98816568 0.98816568 0.98619329 0.99015748 0.98425197
|
|
0.98622047 0.99015748 0.98622047 0.98425197]
|
|
|
|
mean value: 0.9868005404649863
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94545455 0.96428571 1. 0.96551724 0.98181818
|
|
0.96551724 0.96551724 0.96428571 1. ]
|
|
|
|
mean value: 0.9752395879982088
|
|
|
|
key: train_fscore
|
|
value: [0.984375 0.98828125 0.98823529 0.98624754 0.99017682 0.984375
|
|
0.98630137 0.99017682 0.98630137 0.984375 ]
|
|
|
|
mean value: 0.98688454626256
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 1. 1. 0.93333333 1.
|
|
0.93333333 0.93333333 0.96428571 1. ]
|
|
|
|
mean value: 0.9727248677248678
|
|
|
|
key: train_precision
|
|
value: [0.97674419 0.98062016 0.98054475 0.98046875 0.98823529 0.97674419
|
|
0.98054475 0.98823529 0.98054475 0.97674419]
|
|
|
|
mean value: 0.9809426292658725
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.93103448 1. 1. 0.96428571
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9788177339901478
|
|
|
|
key: train_recall
|
|
value: [0.99212598 0.99606299 0.99604743 0.99209486 0.99212598 0.99212598
|
|
0.99212598 0.99212598 0.99212598 0.99212598]
|
|
|
|
mean value: 0.9929087174379883
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.94704433 0.96551724 1. 0.96428571 0.98214286
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9751847290640394
|
|
|
|
key: train_roc_auc
|
|
value: [0.98420528 0.98815007 0.9881812 0.98620491 0.99015748 0.98425197
|
|
0.98622047 0.99015748 0.98622047 0.98425197]
|
|
|
|
mean value: 0.9868001307148859
|
|
|
|
key: test_jcc
|
|
value: [1. 0.89655172 0.93103448 1. 0.93333333 0.96428571
|
|
0.93333333 0.93333333 0.93103448 1. ]
|
|
|
|
mean value: 0.9522906403940887
|
|
|
|
key: train_jcc
|
|
value: [0.96923077 0.97683398 0.97674419 0.97286822 0.98054475 0.96923077
|
|
0.97297297 0.98054475 0.97297297 0.96923077]
|
|
|
|
mean value: 0.9741174127736429
|
|
|
|
MCC on Blind test: 0.83
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0242126 0.01131368 0.01130939 0.01131654 0.01126266 0.01142788
|
|
0.01134253 0.01130843 0.01151514 0.01075053]
|
|
|
|
mean value: 0.012575936317443848
|
|
|
|
key: score_time
|
|
value: [0.00987244 0.00902605 0.00979638 0.0095818 0.00964236 0.00970507
|
|
0.00961256 0.00966287 0.00970745 0.00935006]
|
|
|
|
mean value: 0.009595704078674317
|
|
|
|
key: test_mcc
|
|
value: [0.58562417 0.62473685 0.50927421 0.57973205 0.60753044 0.71428571
|
|
0.64951905 0.72168784 0.67900461 0.39310793]
|
|
|
|
mean value: 0.6064502839116733
|
|
|
|
key: train_mcc
|
|
value: [0.63864108 0.67343572 0.67495523 0.65362362 0.63188315 0.6387663
|
|
0.65228602 0.64665231 0.67097829 0.6472967 ]
|
|
|
|
mean value: 0.6528518419173023
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.80701754 0.75438596 0.78947368 0.80357143 0.85714286
|
|
0.82142857 0.85714286 0.83928571 0.69642857]
|
|
|
|
mean value: 0.8015350877192983
|
|
|
|
key: train_accuracy
|
|
value: [0.81854043 0.83629191 0.83629191 0.82642998 0.81496063 0.81889764
|
|
0.82480315 0.82283465 0.83464567 0.82283465]
|
|
|
|
mean value: 0.8256530618583919
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.81967213 0.76666667 0.8 0.80701754 0.85714286
|
|
0.83333333 0.86666667 0.83636364 0.69090909]
|
|
|
|
mean value: 0.8077771926089441
|
|
|
|
key: train_fscore
|
|
value: [0.82509506 0.84069098 0.84250474 0.83011583 0.8219697 0.82375479
|
|
0.83239171 0.82758621 0.84030418 0.82889734]
|
|
|
|
mean value: 0.8313310537668297
|
|
|
|
key: test_precision
|
|
value: [0.75 0.75757576 0.74193548 0.77419355 0.79310345 0.85714286
|
|
0.78125 0.8125 0.85185185 0.7037037 ]
|
|
|
|
mean value: 0.7823256650808097
|
|
|
|
key: train_precision
|
|
value: [0.79779412 0.82022472 0.81021898 0.81132075 0.7919708 0.80223881
|
|
0.79783394 0.80597015 0.8125 0.80147059]
|
|
|
|
mean value: 0.8051542850964287
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.89285714 0.79310345 0.82758621 0.82142857 0.85714286
|
|
0.89285714 0.92857143 0.82142857 0.67857143]
|
|
|
|
mean value: 0.8370689655172414
|
|
|
|
key: train_recall
|
|
value: [0.85433071 0.86220472 0.87747036 0.84980237 0.85433071 0.84645669
|
|
0.87007874 0.8503937 0.87007874 0.85826772]
|
|
|
|
mean value: 0.8593414459556192
|
|
|
|
key: test_roc_auc
|
|
value: [0.79064039 0.80849754 0.75369458 0.7887931 0.80357143 0.85714286
|
|
0.82142857 0.85714286 0.83928571 0.69642857]
|
|
|
|
mean value: 0.8016625615763546
|
|
|
|
key: train_roc_auc
|
|
value: [0.8184697 0.8362407 0.83637297 0.82647599 0.81496063 0.81889764
|
|
0.82480315 0.82283465 0.83464567 0.82283465]
|
|
|
|
mean value: 0.8256535744296786
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.69444444 0.62162162 0.66666667 0.67647059 0.75
|
|
0.71428571 0.76470588 0.71875 0.52777778]
|
|
|
|
mean value: 0.6801389362051127
|
|
|
|
key: train_jcc
|
|
value: [0.70226537 0.72516556 0.72786885 0.70957096 0.6977492 0.70032573
|
|
0.71290323 0.70588235 0.72459016 0.70779221]
|
|
|
|
mean value: 0.7114113624151682
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.08987308 0.22393084 0.25971007 0.22161651 0.2970469 0.07107282
|
|
0.3908968 0.34707355 0.3672266 0.26015067]
|
|
|
|
mean value: 0.25285978317260743
|
|
|
|
key: score_time
|
|
value: [0.01140714 0.01223254 0.01125836 0.01227474 0.0113318 0.01112461
|
|
0.01194763 0.01308942 0.01288772 0.01307106]
|
|
|
|
mean value: 0.012062501907348634
|
|
|
|
key: test_mcc
|
|
value: [1. 0.82880708 0.96551724 0.96547546 0.89802651 1.
|
|
0.96490128 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.9518578190858389
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9122807 0.98245614 0.98245614 0.94642857 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.975219298245614
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.90566038 0.98245614 0.98305085 0.94915254 1.
|
|
0.98245614 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.9750749429620941
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96 1. 0.96666667 0.90322581 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9694260289210234
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 0.96551724 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9822660098522168
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.91133005 0.98275862 0.98214286 0.94642857 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9751231527093597
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.82758621 0.96551724 0.96666667 0.90322581 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9527363737486095
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0668292 0.06841993 0.12328172 0.05418634 0.05178666 0.0750885
|
|
0.09460568 0.09279466 0.08649731 0.06908274]
|
|
|
|
mean value: 0.07825727462768554
|
|
|
|
key: score_time
|
|
value: [0.01650906 0.02038956 0.02707815 0.01239896 0.01989675 0.01733708
|
|
0.01960182 0.02041531 0.02011204 0.01233983]
|
|
|
|
mean value: 0.01860785484313965
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.82512315 0.82490815 0.8951918 0.89342711 1.
|
|
0.85933785 0.82618439 1. 0.96490128]
|
|
|
|
mean value: 0.9021256935269077
|
|
|
|
key: train_mcc
|
|
value: [0.96055211 0.96844169 0.97239426 0.96844169 0.96850394 0.9645744
|
|
0.96850394 0.98425197 0.96850394 0.9645744 ]
|
|
|
|
mean value: 0.968874234820875
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.9122807 0.9122807 0.94736842 0.94642857 1.
|
|
0.92857143 0.91071429 1. 0.98214286]
|
|
|
|
mean value: 0.9504699248120301
|
|
|
|
key: train_accuracy
|
|
value: [0.98027613 0.98422091 0.98619329 0.98422091 0.98425197 0.98228346
|
|
0.98425197 0.99212598 0.98425197 0.98228346]
|
|
|
|
mean value: 0.9844360061501188
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.9122807 0.91525424 0.94915254 0.94736842 1.
|
|
0.93103448 0.91525424 1. 0.98181818]
|
|
|
|
mean value: 0.9517680045712283
|
|
|
|
key: train_fscore
|
|
value: [0.98031496 0.98425197 0.98619329 0.98418972 0.98425197 0.98224852
|
|
0.98425197 0.99212598 0.98425197 0.98231827]
|
|
|
|
mean value: 0.98443986279333
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.89655172 0.9 0.93333333 0.93103448 1.
|
|
0.9 0.87096774 1. 1. ]
|
|
|
|
mean value: 0.9365220615498703
|
|
|
|
key: train_precision
|
|
value: [0.98031496 0.98425197 0.98425197 0.98418972 0.98425197 0.98418972
|
|
0.98425197 0.99212598 0.98425197 0.98039216]
|
|
|
|
mean value: 0.9842472390904636
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.93103448 0.96551724 0.96428571 1.
|
|
0.96428571 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.9682266009852217
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98425197 0.98814229 0.98418972 0.98425197 0.98031496
|
|
0.98425197 0.99212598 0.98425197 0.98425197]
|
|
|
|
mean value: 0.9846347763841773
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.91256158 0.91194581 0.94704433 0.94642857 1.
|
|
0.92857143 0.91071429 1. 0.98214286]
|
|
|
|
mean value: 0.9504926108374385
|
|
|
|
key: train_roc_auc
|
|
value: [0.98027606 0.98422085 0.98619713 0.98422085 0.98425197 0.98228346
|
|
0.98425197 0.99212598 0.98425197 0.98228346]
|
|
|
|
mean value: 0.984436369860882
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.83870968 0.84375 0.90322581 0.9 1.
|
|
0.87096774 0.84375 1. 0.96428571]
|
|
|
|
mean value: 0.9098022273425499
|
|
|
|
key: train_jcc
|
|
value: [0.96138996 0.96899225 0.97276265 0.9688716 0.96899225 0.96511628
|
|
0.96899225 0.984375 0.96899225 0.96525097]
|
|
|
|
mean value: 0.9693735439203892
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02307963 0.01162243 0.01094699 0.01106811 0.01101923 0.01119828
|
|
0.01114917 0.01126027 0.01035953 0.01032233]
|
|
|
|
mean value: 0.012202596664428711
|
|
|
|
key: score_time
|
|
value: [0.01033926 0.01000857 0.00924134 0.00996804 0.01003981 0.01006603
|
|
0.00951672 0.00953388 0.00911236 0.00968027]
|
|
|
|
mean value: 0.009750628471374511
|
|
|
|
key: test_mcc
|
|
value: [0.52204981 0.68850906 0.57881773 0.64901478 0.50128041 0.64285714
|
|
0.57735027 0.65814518 0.64285714 0.53605627]
|
|
|
|
mean value: 0.599693779541295
|
|
|
|
key: train_mcc
|
|
value: [0.60210948 0.6702837 0.66589861 0.59833978 0.6189214 0.59993353
|
|
0.65074202 0.63496646 0.6918185 0.59961602]
|
|
|
|
mean value: 0.6332629515394039
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.84210526 0.78947368 0.8245614 0.75 0.82142857
|
|
0.78571429 0.82142857 0.82142857 0.76785714]
|
|
|
|
mean value: 0.7978383458646616
|
|
|
|
key: train_accuracy
|
|
value: [0.80078895 0.83431953 0.83234714 0.79881657 0.80905512 0.7992126
|
|
0.82480315 0.81692913 0.84448819 0.7992126 ]
|
|
|
|
mean value: 0.8159972976750687
|
|
|
|
key: test_fscore
|
|
value: [0.77419355 0.84745763 0.79310345 0.82758621 0.74074074 0.82142857
|
|
0.8 0.83870968 0.82142857 0.77192982]
|
|
|
|
mean value: 0.8036578216256797
|
|
|
|
key: train_fscore
|
|
value: [0.80539499 0.84030418 0.83685221 0.8030888 0.81381958 0.80608365
|
|
0.82982792 0.82217973 0.85122411 0.80534351]
|
|
|
|
mean value: 0.8214118676278633
|
|
|
|
key: test_precision
|
|
value: [0.70588235 0.80645161 0.79310345 0.82758621 0.76923077 0.82142857
|
|
0.75 0.76470588 0.82142857 0.75862069]
|
|
|
|
mean value: 0.7818438105112842
|
|
|
|
key: train_precision
|
|
value: [0.78867925 0.8125 0.81343284 0.78490566 0.79400749 0.77941176
|
|
0.80669145 0.79925651 0.81588448 0.78148148]
|
|
|
|
mean value: 0.7976250910229972
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.89285714 0.79310345 0.82758621 0.71428571 0.82142857
|
|
0.85714286 0.92857143 0.82142857 0.78571429]
|
|
|
|
mean value: 0.8299261083743842
|
|
|
|
key: train_recall
|
|
value: [0.82283465 0.87007874 0.86166008 0.82213439 0.83464567 0.83464567
|
|
0.85433071 0.84645669 0.88976378 0.83070866]
|
|
|
|
mean value: 0.8467259033332296
|
|
|
|
key: test_roc_auc
|
|
value: [0.75615764 0.8429803 0.78940887 0.82450739 0.75 0.82142857
|
|
0.78571429 0.82142857 0.82142857 0.76785714]
|
|
|
|
mean value: 0.7980911330049261
|
|
|
|
key: train_roc_auc
|
|
value: [0.80074539 0.83424886 0.83240484 0.79886247 0.80905512 0.7992126
|
|
0.82480315 0.81692913 0.84448819 0.7992126 ]
|
|
|
|
mean value: 0.8159962341663813
|
|
|
|
key: test_jcc
|
|
value: [0.63157895 0.73529412 0.65714286 0.70588235 0.58823529 0.6969697
|
|
0.66666667 0.72222222 0.6969697 0.62857143]
|
|
|
|
mean value: 0.6729533280616872
|
|
|
|
key: train_jcc
|
|
value: [0.67419355 0.72459016 0.71947195 0.67096774 0.68608414 0.67515924
|
|
0.70915033 0.69805195 0.74098361 0.67412141]
|
|
|
|
mean value: 0.6972774066672848
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02162695 0.02330804 0.02636027 0.02295399 0.0236361 0.06173849
|
|
0.02118683 0.02854466 0.02622795 0.0239768 ]
|
|
|
|
mean value: 0.027956008911132812
|
|
|
|
key: score_time
|
|
value: [0.01124668 0.01192689 0.01218081 0.01735902 0.01788497 0.0120585
|
|
0.01298404 0.01233673 0.0231607 0.01238585]
|
|
|
|
mean value: 0.0143524169921875
|
|
|
|
key: test_mcc
|
|
value: [0.86189955 0.82880708 0.86189955 0.89988258 0.79385662 1.
|
|
0.89802651 0.89342711 0.80439967 0.93094934]
|
|
|
|
mean value: 0.8773147998171975
|
|
|
|
key: train_mcc
|
|
value: [0.97636129 0.96450468 0.97239426 0.91875999 0.93470218 0.95322883
|
|
0.94970991 0.97649905 0.77972956 0.97250878]
|
|
|
|
mean value: 0.9398398525421525
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.9122807 0.92982456 0.94736842 0.89285714 1.
|
|
0.94642857 0.94642857 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9362155388471178
|
|
|
|
key: train_accuracy
|
|
value: [0.98816568 0.98224852 0.98619329 0.95857988 0.96653543 0.97637795
|
|
0.97440945 0.98818898 0.87992126 0.98622047]
|
|
|
|
mean value: 0.9686840920032925
|
|
|
|
key: test_fscore
|
|
value: [0.93103448 0.90566038 0.92857143 0.94545455 0.9 1.
|
|
0.94915254 0.94736842 0.88 0.96551724]
|
|
|
|
mean value: 0.9352759038947909
|
|
|
|
key: train_fscore
|
|
value: [0.98823529 0.98224852 0.98619329 0.95723014 0.96749522 0.97674419
|
|
0.97495183 0.98809524 0.86474501 0.98630137]
|
|
|
|
mean value: 0.9672240106699174
|
|
|
|
key: test_precision
|
|
value: [0.9 0.96 0.96296296 1. 0.84375 1.
|
|
0.90322581 0.93103448 1. 0.93333333]
|
|
|
|
mean value: 0.943430658550653
|
|
|
|
key: train_precision
|
|
value: [0.984375 0.98418972 0.98425197 0.98739496 0.94052045 0.96183206
|
|
0.95471698 0.996 0.98984772 0.98054475]
|
|
|
|
mean value: 0.9763673600922473
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.85714286 0.89655172 0.89655172 0.96428571 1.
|
|
1. 0.96428571 0.78571429 1. ]
|
|
|
|
mean value: 0.9328817733990148
|
|
|
|
key: train_recall
|
|
value: [0.99212598 0.98031496 0.98814229 0.92885375 0.99606299 0.99212598
|
|
0.99606299 0.98031496 0.76771654 0.99212598]
|
|
|
|
mean value: 0.9613846441131617
|
|
|
|
key: test_roc_auc
|
|
value: [0.93041872 0.91133005 0.93041872 0.94827586 0.89285714 1.
|
|
0.94642857 0.94642857 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9363300492610838
|
|
|
|
key: train_roc_auc
|
|
value: [0.98815785 0.98225234 0.98619713 0.95852137 0.96653543 0.97637795
|
|
0.97440945 0.98818898 0.87992126 0.98622047]
|
|
|
|
mean value: 0.9686782235224549
|
|
|
|
key: test_jcc
|
|
value: [0.87096774 0.82758621 0.86666667 0.89655172 0.81818182 1.
|
|
0.90322581 0.9 0.78571429 0.93333333]
|
|
|
|
mean value: 0.8802227583317683
|
|
|
|
key: train_jcc
|
|
value: [0.97674419 0.96511628 0.97276265 0.91796875 0.93703704 0.95454545
|
|
0.95112782 0.97647059 0.76171875 0.97297297]
|
|
|
|
mean value: 0.9386464483370307
|
|
|
|
MCC on Blind test: 0.74
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01865149 0.02236271 0.02846265 0.02018642 0.02099609 0.01799417
|
|
0.01969862 0.01990604 0.02230525 0.02031541]
|
|
|
|
mean value: 0.0210878849029541
|
|
|
|
key: score_time
|
|
value: [0.02322531 0.0123446 0.01212907 0.01237273 0.01238465 0.01233506
|
|
0.01233649 0.01211786 0.01209664 0.02831006]
|
|
|
|
mean value: 0.014965248107910157
|
|
|
|
key: test_mcc
|
|
value: [0.56067321 0.82880708 0.66755025 0.93202124 0.71428571 0.74535599
|
|
0.60485838 0.85714286 0.96490128 0.93094934]
|
|
|
|
mean value: 0.7806545352178498
|
|
|
|
key: train_mcc
|
|
value: [0.54270333 0.98028353 0.80481374 0.9417201 0.97244848 0.75996798
|
|
0.81248429 0.92727605 0.94217971 0.95349515]
|
|
|
|
mean value: 0.8637372357336665
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.9122807 0.80701754 0.96491228 0.85714286 0.85714286
|
|
0.76785714 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.8778195488721804
|
|
|
|
key: train_accuracy
|
|
value: [0.72781065 0.99013807 0.89546351 0.9704142 0.98622047 0.86811024
|
|
0.8976378 0.96259843 0.97047244 0.97637795]
|
|
|
|
mean value: 0.9245243752814922
|
|
|
|
key: test_fscore
|
|
value: [0.78873239 0.90566038 0.76595745 0.96666667 0.85714286 0.83333333
|
|
0.8115942 0.92857143 0.98245614 0.96551724]
|
|
|
|
mean value: 0.8805632088876222
|
|
|
|
key: train_fscore
|
|
value: [0.78637771 0.99017682 0.88453159 0.97098646 0.98624754 0.8494382
|
|
0.90714286 0.96130346 0.97120921 0.97683398]
|
|
|
|
mean value: 0.9284247832831198
|
|
|
|
key: test_precision
|
|
value: [0.65116279 0.96 1. 0.93548387 0.85714286 1.
|
|
0.68292683 0.92857143 0.96551724 0.93333333]
|
|
|
|
mean value: 0.8914138351360639
|
|
|
|
key: train_precision
|
|
value: [0.64795918 0.98823529 0.98543689 0.95075758 0.98431373 0.9895288
|
|
0.83006536 0.99578059 0.94756554 0.95833333]
|
|
|
|
mean value: 0.9277976294653208
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 0.62068966 1. 0.85714286 0.71428571
|
|
1. 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.8977832512315271
|
|
|
|
key: train_recall
|
|
value: [1. 0.99212598 0.80237154 0.99209486 0.98818898 0.74409449
|
|
1. 0.92913386 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9440135694500638
|
|
|
|
key: test_roc_auc
|
|
value: [0.74137931 0.91133005 0.81034483 0.96428571 0.85714286 0.85714286
|
|
0.76785714 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.878448275862069
|
|
|
|
key: train_roc_auc
|
|
value: [0.72727273 0.99013414 0.89528026 0.97045688 0.98622047 0.86811024
|
|
0.8976378 0.96259843 0.97047244 0.97637795]
|
|
|
|
mean value: 0.9244561327067319
|
|
|
|
key: test_jcc
|
|
value: [0.65116279 0.82758621 0.62068966 0.93548387 0.75 0.71428571
|
|
0.68292683 0.86666667 0.96551724 0.93333333]
|
|
|
|
mean value: 0.79476523086677
|
|
|
|
key: train_jcc
|
|
value: [0.64795918 0.98054475 0.79296875 0.94360902 0.97286822 0.73828125
|
|
0.83006536 0.9254902 0.94402985 0.95471698]
|
|
|
|
mean value: 0.8730533557799736
|
|
|
|
MCC on Blind test: 0.62
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25352716 0.23959851 0.23895645 0.23683834 0.23459578 0.23570085
|
|
0.23769593 0.24037766 0.24266458 0.23892522]
|
|
|
|
mean value: 0.23988804817199708
|
|
|
|
key: score_time
|
|
value: [0.01613116 0.01571035 0.01589203 0.01570082 0.01559377 0.01582003
|
|
0.01564693 0.01657271 0.01562572 0.01605487]
|
|
|
|
mean value: 0.015874838829040526
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.82880708 0.9321832 1. 0.89802651 1.
|
|
0.92857143 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.944891429609529
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.9122807 0.96491228 1. 0.94642857 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9716791979949875
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.90566038 0.96428571 1. 0.94915254 1.
|
|
0.96428571 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.971317591185117
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96 1. 1. 0.90322581 1.
|
|
0.96428571 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9726362095449971
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.85714286 0.93103448 1. 1. 1.
|
|
0.96428571 1. 1. 1. ]
|
|
|
|
mean value: 0.9716748768472906
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.91133005 0.96551724 1. 0.94642857 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9716133004926109
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.82758621 0.93103448 1. 0.90322581 1.
|
|
0.93103448 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9456017267863764
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06545424 0.09734583 0.08027411 0.07696271 0.09823036 0.08253455
|
|
0.10306764 0.08882976 0.0632627 0.07575917]
|
|
|
|
mean value: 0.08317210674285888
|
|
|
|
key: score_time
|
|
value: [0.01944017 0.03101039 0.03404474 0.02634931 0.02568722 0.02294087
|
|
0.02933931 0.02022123 0.02197647 0.0271163 ]
|
|
|
|
mean value: 0.025812602043151854
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.79110556 0.89988258 1. 0.79385662 1.
|
|
0.85714286 0.93094934 0.93094934 1. ]
|
|
|
|
mean value: 0.9169361747244729
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99211042 0.98817342 0.98425123 1. 0.99607071
|
|
0.99607071 0.98819663 0.99607071 0.98819663]
|
|
|
|
mean value: 0.9929140477452736
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.89473684 0.94736842 1. 0.89285714 1.
|
|
0.92857143 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9574561403508772
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99605523 0.99408284 0.99211045 1. 0.9980315
|
|
0.9980315 0.99409449 0.9980315 0.99409449]
|
|
|
|
mean value: 0.9964531985276989
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.88888889 0.94545455 1. 0.9 1.
|
|
0.92857143 0.96551724 0.96551724 1. ]
|
|
|
|
mean value: 0.9575767527491665
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99606299 0.99408284 0.99206349 1. 0.99802761
|
|
0.99803536 0.99410609 0.99803536 0.99410609]
|
|
|
|
mean value: 0.9964519845500475
|
|
|
|
key: test_precision
|
|
value: [1. 0.92307692 1. 1. 0.84375 1.
|
|
0.92857143 0.93333333 0.93333333 1. ]
|
|
|
|
mean value: 0.9562065018315018
|
|
|
|
key: train_precision
|
|
value: [1. 0.99606299 0.99212598 0.99601594 1. 1.
|
|
0.99607843 0.99215686 0.99607843 0.99215686]
|
|
|
|
mean value: 0.9960675500868227
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.85714286 0.89655172 1. 0.96428571 1.
|
|
0.92857143 1. 1. 1. ]
|
|
|
|
mean value: 0.9610837438423645
|
|
|
|
key: train_recall
|
|
value: [1. 0.99606299 0.99604743 0.98814229 1. 0.99606299
|
|
1. 0.99606299 1. 0.99606299]
|
|
|
|
mean value: 0.9968441691824095
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.89408867 0.94827586 1. 0.89285714 1.
|
|
0.92857143 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9574507389162562
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99605521 0.99408671 0.99210264 1. 0.9980315
|
|
0.9980315 0.99409449 0.9980315 0.99409449]
|
|
|
|
mean value: 0.9964528025893996
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.8 0.89655172 1. 0.81818182 1.
|
|
0.86666667 0.93333333 0.93333333 1. ]
|
|
|
|
mean value: 0.9212352589938797
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99215686 0.98823529 0.98425197 1. 0.99606299
|
|
0.99607843 0.98828125 0.99607843 0.98828125]
|
|
|
|
mean value: 0.9929426480237764
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.24165225 0.2639544 0.21907234 0.1663208 0.1660831 0.18476343
|
|
0.18354225 0.15284801 0.23790598 0.2857213 ]
|
|
|
|
mean value: 0.21018638610839843
|
|
|
|
key: score_time
|
|
value: [0.02633309 0.04548001 0.03008986 0.05320883 0.0402391 0.03054214
|
|
0.01558995 0.02765036 0.02605081 0.04492259]
|
|
|
|
mean value: 0.034010672569274904
|
|
|
|
key: test_mcc
|
|
value: [0.5149026 0.78940887 0.51851399 0.79778885 0.75047877 0.78571429
|
|
0.61706091 0.89802651 0.79385662 0.75047877]
|
|
|
|
mean value: 0.7216230189409703
|
|
|
|
key: train_mcc
|
|
value: [0.99606293 0.98817342 0.98817323 0.98425123 0.98819663 0.99607071
|
|
0.99212598 0.98819663 0.99212598 0.98819663]
|
|
|
|
mean value: 0.9901573396651306
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.89473684 0.75438596 0.89473684 0.875 0.89285714
|
|
0.80357143 0.94642857 0.89285714 0.875 ]
|
|
|
|
mean value: 0.8583959899749374
|
|
|
|
key: train_accuracy
|
|
value: [0.99802761 0.99408284 0.99408284 0.99211045 0.99409449 0.9980315
|
|
0.99606299 0.99409449 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9950744692416407
|
|
|
|
key: test_fscore
|
|
value: [0.76666667 0.89285714 0.78125 0.88888889 0.87719298 0.89285714
|
|
0.81967213 0.94915254 0.88461538 0.87272727]
|
|
|
|
mean value: 0.8625880154589062
|
|
|
|
key: train_fscore
|
|
value: [0.99803536 0.99408284 0.99405941 0.99206349 0.99408284 0.99803536
|
|
0.99606299 0.99408284 0.99606299 0.99408284]
|
|
|
|
mean value: 0.9950650970118321
|
|
|
|
key: test_precision
|
|
value: [0.71875 0.89285714 0.71428571 0.96 0.86206897 0.89285714
|
|
0.75757576 0.90322581 0.95833333 0.88888889]
|
|
|
|
mean value: 0.8548842751766834
|
|
|
|
key: train_precision
|
|
value: [0.99607843 0.99604743 0.99603175 0.99601594 0.99604743 0.99607843
|
|
0.99606299 0.99604743 0.99606299 0.99604743]
|
|
|
|
mean value: 0.996052025260395
|
|
|
|
key: test_recall
|
|
value: [0.82142857 0.89285714 0.86206897 0.82758621 0.89285714 0.89285714
|
|
0.89285714 1. 0.82142857 0.85714286]
|
|
|
|
mean value: 0.8761083743842365
|
|
|
|
key: train_recall
|
|
value: [1. 0.99212598 0.99209486 0.98814229 0.99212598 1.
|
|
0.99606299 0.99212598 0.99606299 0.99212598]
|
|
|
|
mean value: 0.994086707541004
|
|
|
|
key: test_roc_auc
|
|
value: [0.75554187 0.89470443 0.75246305 0.89593596 0.875 0.89285714
|
|
0.80357143 0.94642857 0.89285714 0.875 ]
|
|
|
|
mean value: 0.858435960591133
|
|
|
|
key: train_roc_auc
|
|
value: [0.99802372 0.99408671 0.99407893 0.99210264 0.99409449 0.9980315
|
|
0.99606299 0.99409449 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9950732937038996
|
|
|
|
key: test_jcc
|
|
value: [0.62162162 0.80645161 0.64102564 0.8 0.78125 0.80645161
|
|
0.69444444 0.90322581 0.79310345 0.77419355]
|
|
|
|
mean value: 0.762176773601273
|
|
|
|
key: train_jcc
|
|
value: [0.99607843 0.98823529 0.98818898 0.98425197 0.98823529 0.99607843
|
|
0.99215686 0.98823529 0.99215686 0.98823529]
|
|
|
|
mean value: 0.9901852709587773
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.97148848 0.95016146 0.97649026 0.94490004 0.95638227 0.95632815
|
|
0.95845008 0.95048833 0.95981479 0.96253824]
|
|
|
|
mean value: 0.9587042093276977
|
|
|
|
key: score_time
|
|
value: [0.00969386 0.00933099 0.00937676 0.00944138 0.00976372 0.00949907
|
|
0.00940275 0.00956464 0.009552 0.00934672]
|
|
|
|
mean value: 0.00949718952178955
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.82880708 0.9321832 1. 0.8660254 1.
|
|
0.89342711 0.93094934 0.96490128 0.96490128]
|
|
|
|
mean value: 0.9313215939799506
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.9122807 0.96491228 1. 0.92857143 1.
|
|
0.94642857 0.96428571 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9645676691729324
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.90566038 0.96428571 1. 0.93333333 1.
|
|
0.94545455 0.96551724 0.98245614 0.98181818]
|
|
|
|
mean value: 0.9641488496943416
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96 1. 1. 0.875 1.
|
|
0.96296296 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9696813537675607
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.85714286 0.93103448 1. 1. 1.
|
|
0.92857143 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9609605911330049
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.91133005 0.96551724 1. 0.92857143 1.
|
|
0.94642857 0.96428571 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9644704433497537
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.82758621 0.93103448 1. 0.875 1.
|
|
0.89655172 0.93333333 0.96551724 0.96428571]
|
|
|
|
mean value: 0.932188013136289
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03023362 0.03938198 0.03128219 0.03190351 0.03164005 0.03126192
|
|
0.03192115 0.03110099 0.03161693 0.0320549 ]
|
|
|
|
mean value: 0.03223972320556641
|
|
|
|
key: score_time
|
|
value: [0.01244426 0.01752877 0.01386118 0.01407957 0.01385164 0.01395226
|
|
0.01401639 0.01404047 0.01423621 0.01403999]
|
|
|
|
mean value: 0.014205074310302735
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.8951918 0.93202124 1. 0.96490128 0.93094934
|
|
0.96490128 0.96490128 0.96490128 1. ]
|
|
|
|
mean value: 0.9512959308288262
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.96491228 1. 0.98214286 0.96428571
|
|
0.98214286 0.98214286 0.98214286 1. ]
|
|
|
|
mean value: 0.975250626566416
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.94545455 0.96666667 1. 0.98245614 0.96551724
|
|
0.98245614 0.98245614 0.98245614 1. ]
|
|
|
|
mean value: 0.9752917560358576
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.96296296 0.93548387 1. 0.96551724 0.93333333
|
|
0.96551724 0.96551724 0.96551724 1. ]
|
|
|
|
mean value: 0.9656812095744243
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.92857143 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.94704433 0.96428571 1. 0.98214286 0.96428571
|
|
0.98214286 0.98214286 0.98214286 1. ]
|
|
|
|
mean value: 0.9751231527093597
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.89655172 0.93548387 1. 0.96551724 0.93333333
|
|
0.96551724 0.96551724 0.96551724 1. ]
|
|
|
|
mean value: 0.9523989618094179
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.1
|
|
|
|
Accuracy on Blind test: 0.76
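
As a cross-check on the QDA fold scores above, the first fold's test metrics are all consistent with one confusion matrix. The counts used here (TP=26, FP=1, FN=2, TN=28) are inferred from the reported accuracy, precision and recall; the script does not print them, and the helper name `binary_metrics` is hypothetical. A minimal sketch in plain Python:

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Return the metrics named in the log from raw confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        # ROC AUC computed from hard class labels equals balanced accuracy
        "roc_auc": (recall + specificity) / 2,
    }

# Counts inferred from the first QDA test fold (an assumption, see above)
fold1 = binary_metrics(tp=26, fp=1, fn=2, tn=28)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

The reported test_roc_auc for this fold matches the balanced accuracy, which suggests (but does not prove) that the script scores ROC AUC on predicted labels rather than on probabilities.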
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02729225 0.03505087 0.03882575 0.04039145 0.03900075 0.03893924
|
|
0.03912377 0.03932238 0.03904605 0.03894711]
|
|
|
|
mean value: 0.037593960762023926
|
|
|
|
key: score_time
|
|
value: [0.01924682 0.02053595 0.01905107 0.0189836 0.01907921 0.01899457
|
|
0.01899743 0.01894832 0.01898527 0.01905417]
|
|
|
|
mean value: 0.01918764114379883
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.8951918 0.92980296 1. 0.85933785 1.
|
|
0.85933785 0.85933785 0.96490128 0.93094934]
|
|
|
|
mean value: 0.9231042121808317
|
|
|
|
key: train_mcc
|
|
value: [0.96450413 0.97239383 0.96847232 0.96847232 0.97244848 0.9645744
|
|
0.97244848 0.97637795 0.9645744 0.9645744 ]
|
|
|
|
mean value: 0.9688840740520619
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.96491228 1. 0.92857143 1.
|
|
0.92857143 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9609335839598998
|
|
|
|
key: train_accuracy
|
|
value: [0.98224852 0.98619329 0.98422091 0.98422091 0.98622047 0.98228346
|
|
0.98622047 0.98818898 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9844363944151951
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.94545455 0.96551724 1. 0.93103448 1.
|
|
0.93103448 0.93103448 0.98181818 0.96551724]
|
|
|
|
mean value: 0.961692789968652
|
|
|
|
key: train_fscore
|
|
value: [0.98231827 0.98624754 0.98425197 0.98425197 0.98624754 0.98231827
|
|
0.98624754 0.98818898 0.98231827 0.98231827]
|
|
|
|
mean value: 0.9844708630478165
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.96296296 0.96551724 1. 0.9 1.
|
|
0.9 0.9 1. 0.93333333]
|
|
|
|
mean value: 0.949514687100894
|
|
|
|
key: train_precision
|
|
value: [0.98039216 0.98431373 0.98039216 0.98039216 0.98431373 0.98039216
|
|
0.98431373 0.98818898 0.98039216 0.98039216]
|
|
|
|
mean value: 0.9823483094025012
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.96551724 1. 0.96428571 1.
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9751231527093596
|
|
|
|
key: train_recall
|
|
value: [0.98425197 0.98818898 0.98814229 0.98814229 0.98818898 0.98425197
|
|
0.98818898 0.98818898 0.98425197 0.98425197]
|
|
|
|
mean value: 0.9866048364507797
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.94704433 0.96490148 1. 0.92857143 1.
|
|
0.92857143 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [0.98224456 0.98618935 0.98422863 0.98422863 0.98622047 0.98228346
|
|
0.98622047 0.98818898 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9844371479256793
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.89655172 0.93333333 1. 0.87096774 1.
|
|
0.87096774 0.87096774 0.96428571 0.93333333]
|
|
|
|
mean value: 0.9273740664230097
|
|
|
|
key: train_jcc
|
|
value: [0.96525097 0.97286822 0.96899225 0.96899225 0.97286822 0.96525097
|
|
0.97286822 0.9766537 0.96525097 0.96525097]
|
|
|
|
mean value: 0.9694246704788737
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.93
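
Each model block in this log prints its cross-validation scores as `key:` / `value:` / `mean value:` triplets, with the ten fold values wrapped over two lines. The following is a minimal sketch of a parser for that layout, assuming the format is exactly as shown here; the function name `parse_cv_block` is hypothetical, and it does not handle warning text interleaved inside a value array.

```python
import re

NUMBER = re.compile(r"-?\d+\.?\d*(?:e-?\d+)?")

def parse_cv_block(lines):
    """Collect 'key:' / 'value:' / 'mean value:' triplets into a dict."""
    scores, key, in_value = {}, None, False
    for raw in lines:
        line = raw.strip()
        if line in ("", "|"):  # skip blank and separator lines
            continue
        if line.startswith("key:"):
            key = line.split(":", 1)[1].strip()
            scores[key] = {"folds": [], "mean": None}
            in_value = False
        elif line.startswith("mean value:") and key:
            scores[key]["mean"] = float(line.split(":", 1)[1])
            in_value = False
        elif key and (line.startswith("value:") or in_value):
            # fold values may wrap over several lines until the closing ']'
            scores[key]["folds"] += [float(x) for x in NUMBER.findall(line)]
            in_value = "]" not in line
    return scores

sample = """\
key: test_mcc
|
value: [0.9321832 0.8951918 0.92980296 1. 0.85933785 1.
|
0.85933785 0.85933785 0.96490128 0.93094934]
|
mean value: 0.9231042121808317
""".splitlines()

parsed = parse_cv_block(sample)
print(len(parsed["test_mcc"]["folds"]), parsed["test_mcc"]["mean"])
```

Recomputing the mean from the parsed fold values is a cheap way to confirm that a block was read completely.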
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.37074566 0.28954816 0.30450511 0.2897861 0.29390836 0.33179474
|
|
0.34224701 0.31382322 0.29044604 0.29306006]
|
|
|
|
mean value: 0.31198644638061523
|
|
|
|
key: score_time
|
|
value: [0.01924872 0.01909328 0.01921964 0.01943707 0.01910305 0.01929712
|
|
0.01911283 0.01910567 0.01906586 0.01909971]
|
|
|
|
mean value: 0.019178295135498048
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.8951918 0.92980296 1. 0.85933785 1.
|
|
0.85933785 0.85933785 0.96490128 0.93094934]
|
|
|
|
mean value: 0.9231042121808317
|
|
|
|
key: train_mcc
|
|
value: [0.96450413 0.97239383 0.96847232 0.96847232 0.97244848 0.9645744
|
|
0.97244848 0.97637795 0.9645744 0.9645744 ]
|
|
|
|
mean value: 0.9688840740520619
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.96491228 1. 0.92857143 1.
|
|
0.92857143 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9609335839598998
|
|
|
|
key: train_accuracy
|
|
value: [0.98224852 0.98619329 0.98422091 0.98422091 0.98622047 0.98228346
|
|
0.98622047 0.98818898 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9844363944151951
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.94545455 0.96551724 1. 0.93103448 1.
|
|
0.93103448 0.93103448 0.98181818 0.96551724]
|
|
|
|
mean value: 0.961692789968652
|
|
|
|
key: train_fscore
|
|
value: [0.98231827 0.98624754 0.98425197 0.98425197 0.98624754 0.98231827
0.98624754 0.98818898 0.98231827 0.98231827]
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:128: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:131: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
|
|
|
|
mean value: 0.9844708630478165
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.96296296 0.96551724 1. 0.9 1.
|
|
0.9 0.9 1. 0.93333333]
|
|
|
|
mean value: 0.949514687100894
|
|
|
|
key: train_precision
|
|
value: [0.98039216 0.98431373 0.98039216 0.98039216 0.98431373 0.98039216
|
|
0.98431373 0.98818898 0.98039216 0.98039216]
|
|
|
|
mean value: 0.9823483094025012
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.96551724 1. 0.96428571 1.
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9751231527093596
|
|
|
|
key: train_recall
|
|
value: [0.98425197 0.98818898 0.98814229 0.98814229 0.98818898 0.98425197
|
|
0.98818898 0.98818898 0.98425197 0.98425197]
|
|
|
|
mean value: 0.9866048364507797
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.94704433 0.96490148 1. 0.92857143 1.
|
|
0.92857143 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [0.98224456 0.98618935 0.98422863 0.98422863 0.98622047 0.98228346
|
|
0.98622047 0.98818898 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9844371479256793
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.89655172 0.93333333 1. 0.87096774 1.
|
|
0.87096774 0.87096774 0.96428571 0.93333333]
|
|
|
|
mean value: 0.9273740664230097
|
|
|
|
key: train_jcc
|
|
value: [0.96525097 0.97286822 0.96899225 0.96899225 0.97286822 0.96525097
|
|
0.97286822 0.9766537 0.96525097 0.96525097]
|
|
|
|
mean value: 0.9694246704788737
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.93
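
QDA reaches 1.0 on every training fold and a mean cross-validated MCC of about 0.95, yet its blind-test MCC is -0.1, while both Ridge variants hold up at 0.81. The gap between cross-validated and blind-test MCC is a quick way to flag models that do not generalise. The sketch below only rearranges numbers copied from the log above; the variable names are illustrative.

```python
# (model, mean CV test_mcc, blind-test MCC), all copied from the log above
results = [
    ("QDA", 0.9512959308288262, -0.1),
    ("Ridge Classifier", 0.9231042121808317, 0.81),
    ("Ridge ClassifierCV", 0.9231042121808317, 0.81),
]

# Smallest CV-to-blind gap first; a large gap suggests overfitting
ranked = sorted(results, key=lambda r: r[1] - r[2])
for name, cv_mcc, blind_mcc in ranked:
    gap = cv_mcc - blind_mcc
    print(f"{name:<20} cv={cv_mcc:.2f} blind={blind_mcc:.2f} gap={gap:.2f}")
```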
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03503633 0.04945874 0.03754044 0.03722906 0.03705621 0.03847361
|
|
0.03858423 0.03832555 0.03757358 0.03789806]
|
|
|
|
mean value: 0.03871757984161377
|
|
|
|
key: score_time
|
|
value: [0.01331329 0.01399422 0.01336384 0.01483345 0.01530671 0.01369762
|
|
0.01375341 0.01342893 0.01370549 0.01364231]
|
|
|
|
mean value: 0.013903927803039551
|
|
|
|
key: test_mcc
|
|
value: [0.83797038 0.82512315 0.82880708 0.9321832 0.71611487 0.89342711
|
|
0.79385662 0.85933785 0.71428571 0.96490128]
|
|
|
|
mean value: 0.8366007266767884
|
|
|
|
key: train_mcc
|
|
value: [0.90933143 0.9172256 0.89754406 0.90144111 0.91732994 0.90945587
|
|
0.90158179 0.91738682 0.90951226 0.89766562]
|
|
|
|
mean value: 0.9078474512102387
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.9122807 0.9122807 0.96491228 0.85714286 0.94642857
|
|
0.89285714 0.92857143 0.85714286 0.98214286]
|
|
|
|
mean value: 0.9166040100250626
|
|
|
|
key: train_accuracy
|
|
value: [0.95463511 0.95857988 0.94871795 0.95069034 0.95866142 0.95472441
|
|
0.9507874 0.95866142 0.95472441 0.9488189 ]
|
|
|
|
mean value: 0.953900122691764
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.9122807 0.91803279 0.96428571 0.86206897 0.94545455
|
|
0.9 0.93103448 0.85714286 0.98245614]
|
|
|
|
mean value: 0.9190788981034733
|
|
|
|
key: train_fscore
|
|
value: [0.95499022 0.95841584 0.94820717 0.95029821 0.95857988 0.95463511
|
|
0.95069034 0.95841584 0.95499022 0.9486166 ]
|
|
|
|
mean value: 0.9537839421981321
|
|
|
|
key: test_precision
|
|
value: [0.84848485 0.89655172 0.875 1. 0.83333333 0.96296296
|
|
0.84375 0.9 0.85714286 0.96551724]
|
|
|
|
mean value: 0.8982742967441243
|
|
|
|
key: train_precision
|
|
value: [0.94941634 0.96414343 0.95582329 0.956 0.96047431 0.95652174
|
|
0.95256917 0.96414343 0.94941634 0.95238095]
|
|
|
|
mean value: 0.9560889000359492
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.96551724 0.93103448 0.89285714 0.92857143
|
|
0.96428571 0.96428571 0.85714286 1. ]
|
|
|
|
mean value: 0.9432266009852217
|
|
|
|
key: train_recall
|
|
value: [0.96062992 0.95275591 0.94071146 0.94466403 0.95669291 0.95275591
|
|
0.9488189 0.95275591 0.96062992 0.94488189]
|
|
|
|
mean value: 0.9515296753913666
|
|
|
|
key: test_roc_auc
|
|
value: [0.9137931 0.91256158 0.91133005 0.96551724 0.85714286 0.94642857
|
|
0.89285714 0.92857143 0.85714286 0.98214286]
|
|
|
|
mean value: 0.9167487684729064
|
|
|
|
key: train_roc_auc
|
|
value: [0.95462326 0.95859139 0.94870219 0.95067847 0.95866142 0.95472441
|
|
0.9507874 0.95866142 0.95472441 0.9488189 ]
|
|
|
|
mean value: 0.9538973265693567
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.83870968 0.84848485 0.93103448 0.75757576 0.89655172
|
|
0.81818182 0.87096774 0.75 0.96551724]
|
|
|
|
mean value: 0.8525508140357974
|
|
|
|
key: train_jcc
|
|
value: [0.91385768 0.92015209 0.90151515 0.90530303 0.92045455 0.91320755
|
|
0.90601504 0.92015209 0.91385768 0.90225564]
|
|
|
|
mean value: 0.9116770489449016
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.06151056 0.97917652 1.15594339 1.03205562 1.13682246 1.09652424
|
|
1.14049935 0.84171343 1.05300856 0.92840409]
|
|
|
|
mean value: 1.0425658226013184
|
|
|
|
key: score_time
|
|
value: [0.01576757 0.01351643 0.01401377 0.01967168 0.01395941 0.02089858
|
|
0.01919866 0.01392365 0.01353335 0.01369238]
|
|
|
|
mean value: 0.015817546844482423
|
|
|
|
key: test_mcc
|
|
value: [0.89988258 0.82512315 0.93202124 0.93202124 0.82618439 0.93094934
|
|
0.89802651 0.89802651 0.89342711 0.96490128]
|
|
|
|
mean value: 0.900056335305218
|
|
|
|
key: train_mcc
|
|
value: [0.99211042 0.98028384 0.98817342 0.99211042 1. 0.98819663
|
|
1. 1. 0.99212598 0.99212598]
|
|
|
|
mean value: 0.9925126702482893
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.96491228 0.96491228 0.91071429 0.96428571
|
|
0.94642857 0.94642857 0.94642857 0.98214286]
|
|
|
|
mean value: 0.9485902255639097
|
|
|
|
key: train_accuracy
|
|
value: [0.99605523 0.99013807 0.99408284 0.99605523 1. 0.99409449
|
|
1. 1. 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9962551833387691
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.9122807 0.96666667 0.96666667 0.91525424 0.96296296
|
|
0.94915254 0.94915254 0.94736842 0.98245614]
|
|
|
|
mean value: 0.9501113423860971
|
|
|
|
key: train_fscore
|
|
value: [0.99606299 0.99013807 0.99408284 0.99604743 1. 0.99410609
|
|
1. 1. 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9962563404879103
|
|
|
|
key: test_precision
|
|
value: [0.90322581 0.89655172 0.93548387 0.93548387 0.87096774 1.
|
|
0.90322581 0.90322581 0.93103448 0.96551724]
|
|
|
|
mean value: 0.9244716351501668
|
|
|
|
key: train_precision
|
|
value: [0.99606299 0.99209486 0.99212598 0.99604743 1. 0.99215686
|
|
1. 1. 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9960614115865138
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9785714285714285
|
|
|
|
key: train_recall
|
|
value: [0.99606299 0.98818898 0.99604743 0.99604743 1. 0.99606299
|
|
1. 1. 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9964535806541969
|
|
|
|
key: test_roc_auc
|
|
value: [0.94827586 0.91256158 0.96428571 0.96428571 0.91071429 0.96428571
|
|
0.94642857 0.94642857 0.94642857 0.98214286]
|
|
|
|
mean value: 0.9485837438423645
|
|
|
|
key: train_roc_auc
|
|
value: [0.99605521 0.99014192 0.99408671 0.99605521 1. 0.99409449
|
|
1. 1. 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9962559521956988
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.83870968 0.93548387 0.93548387 0.84375 0.92857143
|
|
0.90322581 0.90322581 0.9 0.96551724]
|
|
|
|
mean value: 0.9057193508660416
|
|
|
|
key: train_jcc
|
|
value: [0.99215686 0.98046875 0.98823529 0.99212598 1. 0.98828125
|
|
1. 1. 0.99215686 0.99215686]
|
|
|
|
mean value: 0.992558186660491
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.92
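The per-fold values printed above for the LogisticRegressionCV run are consistent with one confusion matrix per fold. As a worked check, the first fold (test_precision 0.90322581, test_recall 1.0, test_accuracy 0.94736842) fits a fold of 57 samples with TP=28, FP=3, FN=0, TN=26. These counts are inferred from the reported numbers and do not appear in the log. The function name `binary_metrics` is illustrative, and reading `jcc` as the Jaccard index is an assumption, although the numbers agree with it.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Binary-classification metrics from confusion-matrix counts.

    No zero-division handling: every denominator is assumed to be non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
    }


# Counts inferred from the first fold of the LogisticRegressionCV run.
fold1 = binary_metrics(tp=28, fp=3, fn=0, tn=26)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

With the same counts, (recall + specificity) / 2 = (1 + 26/29) / 2 ≈ 0.94827586, which matches the first test_roc_auc entry. That suggests, but does not prove, that the ROC AUC in this log was computed from hard class predictions and not from probability scores.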
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
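The pipeline repr above (a `ColumnTransformer` with `MinMaxScaler` on numeric columns and `OneHotEncoder` on categorical columns, followed by a model) and the `fit_time` / `test_mcc` / `train_jcc` keys that follow it look like output of scikit-learn's `cross_validate` with a scoring dictionary and `return_train_score=True`. The generating script is not shown in this log, so the sketch below is an assumption about how such output could be produced. The data are synthetic, only three column names are borrowed from the log, and `handle_unknown="ignore"` and `sparse_threshold=0` are additions made here to keep the toy example robust.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data: column names come from the log, values are random.
rng = np.random.default_rng(42)
n = 100
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, n),
    "ddg_foldx": rng.normal(0, 2, n),
    "ss_class": rng.choice(["helix", "sheet", "loop"], n),
})
y = (X["ddg_foldx"] + rng.normal(0, 1, n) > 0).astype(int)

numeric_cols = ["ligand_distance", "ddg_foldx"]
categorical_cols = ["ss_class"]

prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array (GaussianNB needs one)
)
pipeline = Pipeline(steps=[("prep", prep), ("model", GaussianNB())])

# Each scoring key "k" yields "test_k" and "train_k" arrays in the result.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "jcc": make_scorer(jaccard_score),
}
scores = cross_validate(pipeline, X, y, cv=10, scoring=scoring,
                        return_train_score=True)

for key in sorted(scores):
    print("key:", key)
    print("mean value:", scores[key].mean())
```

Keeping the scaler and encoder inside the pipeline means they are refit on the training part of every fold, so no statistics from the held-out part leak into preprocessing.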
|
|
|
|
key: fit_time
|
|
value: [0.01462245 0.01250768 0.01081085 0.01020527 0.01017833 0.01032519
|
|
0.01021409 0.01031613 0.01019001 0.01003003]
|
|
|
|
mean value: 0.010940003395080566
|
|
|
|
key: score_time
|
|
value: [0.01250744 0.00965738 0.00924444 0.00913548 0.00906038 0.00908661
|
|
0.00905704 0.00912213 0.00909281 0.00910807]
|
|
|
|
mean value: 0.009507179260253906
|
|
|
|
key: test_mcc
|
|
value: [0.59358067 0.6166424 0.75492611 0.58562417 0.53881591 0.71611487
|
|
0.65814518 0.5728919 0.72168784 0.78571429]
|
|
|
|
mean value: 0.6544143324803186
|
|
|
|
key: train_mcc
|
|
value: [0.73570695 0.73999638 0.7556462 0.70845665 0.69724436 0.72930229
|
|
0.77174925 0.73248786 0.78860037 0.67735436]
|
|
|
|
mean value: 0.7336544661308108
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.80701754 0.87719298 0.78947368 0.76785714 0.85714286
|
|
0.82142857 0.78571429 0.85714286 0.89285714]
|
|
|
|
mean value: 0.82453007518797
|
|
|
|
key: train_accuracy
|
|
value: [0.8678501 0.86982249 0.87771203 0.85404339 0.84448819 0.86417323
|
|
0.88582677 0.86614173 0.89370079 0.83858268]
|
|
|
|
mean value: 0.8662341393716319
|
|
|
|
key: test_fscore
|
|
value: [0.80645161 0.79245283 0.87719298 0.77777778 0.75471698 0.86206897
|
|
0.83870968 0.77777778 0.84615385 0.89285714]
|
|
|
|
mean value: 0.8226159594183262
|
|
|
|
key: train_fscore
|
|
value: [0.8678501 0.87209302 0.87890625 0.85603113 0.8315565 0.86756238
|
|
0.88671875 0.86770428 0.89655172 0.84046693]
|
|
|
|
mean value: 0.8665441063880106
|
|
|
|
key: test_precision
|
|
value: [0.73529412 0.84 0.89285714 0.84 0.8 0.83333333
|
|
0.76470588 0.80769231 0.91666667 0.89285714]
|
|
|
|
mean value: 0.8323406593406594
|
|
|
|
key: train_precision
|
|
value: [0.86956522 0.85877863 0.86872587 0.84291188 0.90697674 0.84644195
|
|
0.87984496 0.85769231 0.87313433 0.83076923]
|
|
|
|
mean value: 0.8634841109277654
|
|
|
|
key: test_recall
|
|
value: [0.89285714 0.75 0.86206897 0.72413793 0.71428571 0.89285714
|
|
0.92857143 0.75 0.78571429 0.89285714]
|
|
|
|
mean value: 0.8193349753694581
|
|
|
|
key: train_recall
|
|
value: [0.86614173 0.88582677 0.88932806 0.86956522 0.76771654 0.88976378
|
|
0.89370079 0.87795276 0.92125984 0.8503937 ]
|
|
|
|
mean value: 0.8711649186144222
|
|
|
|
key: test_roc_auc
|
|
value: [0.79125616 0.80603448 0.87746305 0.79064039 0.76785714 0.85714286
|
|
0.82142857 0.78571429 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8247536945812808
|
|
|
|
key: train_roc_auc
|
|
value: [0.86785347 0.86979086 0.8777349 0.85407395 0.84448819 0.86417323
|
|
0.88582677 0.86614173 0.89370079 0.83858268]
|
|
|
|
mean value: 0.8662366561887274
|
|
|
|
key: test_jcc
|
|
value: [0.67567568 0.65625 0.78125 0.63636364 0.60606061 0.75757576
|
|
0.72222222 0.63636364 0.73333333 0.80645161]
|
|
|
|
mean value: 0.7011546480498093
|
|
|
|
key: train_jcc
|
|
value: [0.76655052 0.77319588 0.78397213 0.74829932 0.71167883 0.76610169
|
|
0.79649123 0.76632302 0.8125 0.72483221]
|
|
|
|
mean value: 0.7649944838022477
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01057291 0.01041102 0.01036787 0.01058412 0.01035976 0.01047611
|
|
0.01075315 0.01133537 0.01048946 0.01053786]
|
|
|
|
mean value: 0.010588765144348145
|
|
|
|
key: score_time
|
|
value: [0.00910211 0.00900173 0.00892186 0.0090301 0.00900674 0.00920486
|
|
0.00922561 0.00952029 0.00915504 0.00927591]
|
|
|
|
mean value: 0.009144425392150879
|
|
|
|
key: test_mcc
|
|
value: [0.80817326 0.57973205 0.43842365 0.43842365 0.57735027 0.64285714
|
|
0.58501794 0.64285714 0.47187011 0.53605627]
|
|
|
|
mean value: 0.5720761462070288
|
|
|
|
key: train_mcc
|
|
value: [0.59369456 0.64499463 0.64499463 0.63709364 0.62999938 0.56756289
|
|
0.64173726 0.60292787 0.63779528 0.59849942]
|
|
|
|
mean value: 0.6199299551759412
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.78947368 0.71929825 0.71929825 0.78571429 0.82142857
|
|
0.78571429 0.82142857 0.73214286 0.76785714]
|
|
|
|
mean value: 0.7837092731829574
|
|
|
|
key: train_accuracy
|
|
value: [0.79684418 0.82248521 0.82248521 0.81854043 0.81496063 0.78346457
|
|
0.82086614 0.8011811 0.81889764 0.7992126 ]
|
|
|
|
mean value: 0.8098937706751153
|
|
|
|
key: test_fscore
|
|
value: [0.90322581 0.77777778 0.72413793 0.72413793 0.76923077 0.82142857
|
|
0.80645161 0.82142857 0.70588235 0.76363636]
|
|
|
|
mean value: 0.7817337687867034
|
|
|
|
key: train_fscore
|
|
value: [0.79684418 0.82213439 0.82283465 0.81746032 0.81640625 0.77822581
|
|
0.82121807 0.79678068 0.81889764 0.79761905]
|
|
|
|
mean value: 0.8088421032567705
|
|
|
|
key: test_precision
|
|
value: [0.82352941 0.80769231 0.72413793 0.72413793 0.83333333 0.82142857
|
|
0.73529412 0.82142857 0.7826087 0.77777778]
|
|
|
|
mean value: 0.7851368648793466
|
|
|
|
key: train_precision
|
|
value: [0.79841897 0.82539683 0.81960784 0.82071713 0.81007752 0.79752066
|
|
0.81960784 0.81481481 0.81889764 0.804 ]
|
|
|
|
mean value: 0.8129059248624415
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.72413793 0.72413793 0.71428571 0.82142857
|
|
0.89285714 0.82142857 0.64285714 0.75 ]
|
|
|
|
mean value: 0.7841133004926109
|
|
|
|
key: train_recall
|
|
value: [0.79527559 0.81889764 0.82608696 0.81422925 0.82283465 0.75984252
|
|
0.82283465 0.77952756 0.81889764 0.79133858]
|
|
|
|
mean value: 0.8049765024431235
|
|
|
|
key: test_roc_auc
|
|
value: [0.89655172 0.7887931 0.71921182 0.71921182 0.78571429 0.82142857
|
|
0.78571429 0.82142857 0.73214286 0.76785714]
|
|
|
|
mean value: 0.7838054187192118
|
|
|
|
key: train_roc_auc
|
|
value: [0.79684728 0.8224923 0.8224923 0.81853195 0.81496063 0.78346457
|
|
0.82086614 0.8011811 0.81889764 0.7992126 ]
|
|
|
|
mean value: 0.8098946500264542
|
|
|
|
key: test_jcc
|
|
value: [0.82352941 0.63636364 0.56756757 0.56756757 0.625 0.6969697
|
|
0.67567568 0.6969697 0.54545455 0.61764706]
|
|
|
|
mean value: 0.6452744857156621
|
|
|
|
key: train_jcc
|
|
value: [0.66229508 0.69798658 0.69899666 0.69127517 0.68976898 0.6369637
|
|
0.69666667 0.66220736 0.69333333 0.66336634]
|
|
|
|
mean value: 0.6792859850212573
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01003718 0.01101255 0.01126194 0.01116419 0.01099014 0.01097584
|
|
0.01107812 0.01125479 0.01113129 0.01112223]
|
|
|
|
mean value: 0.011002826690673827
|
|
|
|
key: score_time
|
|
value: [0.01278234 0.0125072 0.01366591 0.01346803 0.01265216 0.01326346
|
|
0.0132699 0.01311994 0.01899934 0.01349974]
|
|
|
|
mean value: 0.013722801208496093
|
|
|
|
key: test_mcc
|
|
value: [0.62036458 0.54377353 0.45409716 0.66268617 0.57735027 0.68965631
|
|
0.53881591 0.68250015 0.46697379 0.60753044]
|
|
|
|
mean value: 0.5843748306991364
|
|
|
|
key: train_mcc
|
|
value: [0.76941166 0.76071428 0.79980738 0.76082422 0.79356189 0.79286644
|
|
0.77572829 0.76054069 0.74294954 0.78364389]
|
|
|
|
mean value: 0.7740048286091203
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.77192982 0.71929825 0.8245614 0.78571429 0.83928571
|
|
0.76785714 0.83928571 0.73214286 0.80357143]
|
|
|
|
mean value: 0.787312030075188
|
|
|
|
key: train_accuracy
|
|
value: [0.8816568 0.87771203 0.8974359 0.87771203 0.89566929 0.89173228
|
|
0.88582677 0.87795276 0.87007874 0.88976378]
|
|
|
|
mean value: 0.884554038733324
|
|
|
|
key: test_fscore
|
|
value: [0.81818182 0.76363636 0.75757576 0.84375 0.8 0.85245902
|
|
0.77966102 0.84745763 0.71698113 0.8 ]
|
|
|
|
mean value: 0.797970273193065
|
|
|
|
key: train_fscore
|
|
value: [0.88888889 0.88475836 0.90262172 0.88432836 0.89943074 0.89945155
|
|
0.89138577 0.88432836 0.8754717 0.89513109]
|
|
|
|
mean value: 0.8905796538479781
|
|
|
|
key: test_precision
|
|
value: [0.71052632 0.77777778 0.67567568 0.77142857 0.75 0.78787879
|
|
0.74193548 0.80645161 0.76 0.81481481]
|
|
|
|
mean value: 0.7596489040139295
|
|
|
|
key: train_precision
|
|
value: [0.83916084 0.83802817 0.85765125 0.83745583 0.86813187 0.83959044
|
|
0.85 0.84042553 0.84057971 0.85357143]
|
|
|
|
mean value: 0.8464595066564342
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.75 0.86206897 0.93103448 0.85714286 0.92857143
|
|
0.82142857 0.89285714 0.67857143 0.78571429]
|
|
|
|
mean value: 0.847167487684729
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.93700787 0.95256917 0.93675889 0.93307087 0.96850394
|
|
0.93700787 0.93307087 0.91338583 0.94094488]
|
|
|
|
mean value: 0.9397202078989139
|
|
|
|
key: test_roc_auc
|
|
value: [0.79248768 0.77155172 0.71674877 0.8226601 0.78571429 0.83928571
|
|
0.76785714 0.83928571 0.73214286 0.80357143]
|
|
|
|
mean value: 0.7871305418719212
|
|
|
|
key: train_roc_auc
|
|
value: [0.88153185 0.87759485 0.89754443 0.87782827 0.89566929 0.89173228
|
|
0.88582677 0.87795276 0.87007874 0.88976378]
|
|
|
|
mean value: 0.8845523015156702
|
|
|
|
key: test_jcc
|
|
value: [0.69230769 0.61764706 0.6097561 0.72972973 0.66666667 0.74285714
|
|
0.63888889 0.73529412 0.55882353 0.66666667]
|
|
|
|
mean value: 0.6658637590560116
|
|
|
|
key: train_jcc
|
|
value: [0.8 0.79333333 0.8225256 0.79264214 0.81724138 0.81727575
|
|
0.80405405 0.79264214 0.77852349 0.81016949]
|
|
|
|
mean value: 0.8028407373870428
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02933192 0.02366972 0.0223372 0.02245879 0.02203465 0.02228165
|
|
0.02257109 0.02298236 0.02250648 0.02265596]
|
|
|
|
mean value: 0.023282980918884276
|
|
|
|
key: score_time
|
|
value: [0.01401258 0.01203036 0.01217723 0.01243663 0.01200104 0.01210237
|
|
0.01222324 0.0121057 0.01228499 0.01206851]
|
|
|
|
mean value: 0.012344264984130859
|
|
|
|
key: test_mcc
|
|
value: [0.75047877 0.7589669 0.7257422 0.96551724 0.71428571 0.89342711
|
|
0.73127242 0.68250015 0.64450339 0.93094934]
|
|
|
|
mean value: 0.7797643240554251
|
|
|
|
key: train_mcc
|
|
value: [0.83222561 0.85928385 0.83474492 0.8364528 0.84004879 0.88213591
|
|
0.84756752 0.84293789 0.84004879 0.85869374]
|
|
|
|
mean value: 0.8474139833983546
|
|
|
|
key: test_accuracy
|
|
value: [0.85964912 0.87719298 0.85964912 0.98245614 0.85714286 0.94642857
|
|
0.85714286 0.83928571 0.82142857 0.96428571]
|
|
|
|
mean value: 0.8864661654135338
|
|
|
|
key: train_accuracy
|
|
value: [0.91518738 0.92899408 0.91715976 0.91715976 0.91929134 0.94094488
|
|
0.92322835 0.92125984 0.91929134 0.92913386]
|
|
|
|
mean value: 0.9231650592492506
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.88135593 0.87096774 0.98245614 0.85714286 0.94545455
|
|
0.87096774 0.84745763 0.81481481 0.96551724]
|
|
|
|
mean value: 0.8911134642335407
|
|
|
|
key: train_fscore
|
|
value: [0.91809524 0.93103448 0.91828794 0.91984733 0.92160612 0.94163424
|
|
0.92514395 0.92248062 0.92160612 0.93023256]
|
|
|
|
mean value: 0.9249968597409465
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.83870968 0.81818182 1. 0.85714286 0.96296296
|
|
0.79411765 0.80645161 0.84615385 0.93333333]
|
|
|
|
mean value: 0.8634831532934
|
|
|
|
key: train_precision
|
|
value: [0.88929889 0.90671642 0.90421456 0.88929889 0.89591078 0.93076923
|
|
0.90262172 0.90839695 0.89591078 0.91603053]
|
|
|
|
mean value: 0.9039168759145274
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.93103448 0.96551724 0.85714286 0.92857143
|
|
0.96428571 0.89285714 0.78571429 1. ]
|
|
|
|
mean value: 0.9253694581280788
|
|
|
|
key: train_recall
|
|
value: [0.9488189 0.95669291 0.93280632 0.95256917 0.9488189 0.95275591
|
|
0.9488189 0.93700787 0.9488189 0.94488189]
|
|
|
|
mean value: 0.9471989667299493
|
|
|
|
key: test_roc_auc
|
|
value: [0.86206897 0.87807882 0.85837438 0.98275862 0.85714286 0.94642857
|
|
0.85714286 0.83928571 0.82142857 0.96428571]
|
|
|
|
mean value: 0.8866995073891626
|
|
|
|
key: train_roc_auc
|
|
value: [0.91512091 0.92893934 0.91719056 0.91722947 0.91929134 0.94094488
|
|
0.92322835 0.92125984 0.91929134 0.92913386]
|
|
|
|
mean value: 0.9231629890137251
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.78787879 0.77142857 0.96551724 0.75 0.89655172
|
|
0.77142857 0.73529412 0.6875 0.93333333]
|
|
|
|
mean value: 0.8076710125011343
|
|
|
|
key: train_jcc
|
|
value: [0.84859155 0.87096774 0.84892086 0.85159011 0.85460993 0.88970588
|
|
0.86071429 0.85611511 0.85460993 0.86956522]
|
|
|
|
mean value: 0.8605390612075907
|
|
|
|
MCC on Blind test: 0.66
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.33146954 2.55629253 2.25351214 2.16796517 2.04874682 2.0742538
|
|
2.40498137 2.58180976 2.5396452 2.15567756]
|
|
|
|
mean value: 2.3114353895187376
|
|
|
|
key: score_time
|
|
value: [0.01615882 0.01403785 0.01418424 0.026016 0.01434445 0.01499295
|
|
0.03440094 0.01265144 0.01818609 0.02401257]
|
|
|
|
mean value: 0.018898534774780273
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.89988258 0.93202124 0.96547546 0.85933785 0.89342711
|
|
0.89802651 0.89802651 0.96490128 0.96490128]
|
|
|
|
mean value: 0.9208183019462288
|
|
|
|
key: train_mcc
|
|
value: [0.99606293 0.99606293 0.99606299 0.99606299 1. 0.99607071
|
|
0.99212598 1. 0.99212598 0.99607071]
|
|
|
|
mean value: 0.9960645238155992
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.96491228 0.98245614 0.92857143 0.94642857
|
|
0.94642857 0.94642857 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9591791979949874
|
|
|
|
key: train_accuracy
|
|
value: [0.99802761 0.99802761 0.99802761 0.99802761 1. 0.9980315
|
|
0.99606299 1. 0.99606299 0.9980315 ]
|
|
|
|
mean value: 0.9980299430026868
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.94915254 0.96666667 0.98305085 0.93103448 0.94545455
|
|
0.94915254 0.94915254 0.98181818 0.98245614]
|
|
|
|
mean value: 0.9603455733004473
|
|
|
|
key: train_fscore
|
|
value: [0.99803536 0.99803536 0.99802761 0.99802761 1. 0.99803536
|
|
0.99606299 1. 0.99606299 0.99803536]
|
|
|
|
mean value: 0.9980322664907467
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.90322581 0.93548387 0.96666667 0.9 0.96296296
|
|
0.90322581 0.90322581 1. 0.96551724]
|
|
|
|
mean value: 0.9373641494664854
|
|
|
|
key: train_precision
|
|
value: [0.99607843 0.99607843 0.99606299 0.99606299 1. 0.99607843
|
|
0.99606299 1. 0.99606299 0.99607843]
|
|
|
|
mean value: 0.9968565693994134
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.99606299 1. 0.99606299 1. ]
|
|
|
|
mean value: 0.9992125984251968
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.94827586 0.96428571 0.98214286 0.92857143 0.94642857
|
|
0.94642857 0.94642857 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9592364532019705
|
|
|
|
key: train_roc_auc
|
|
value: [0.99802372 0.99802372 0.9980315 0.9980315 1. 0.9980315
|
|
0.99606299 1. 0.99606299 0.9980315 ]
|
|
|
|
mean value: 0.9980299399333976
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.90322581 0.93548387 0.96666667 0.87096774 0.89655172
|
|
0.90322581 0.90322581 0.96428571 0.96551724]
|
|
|
|
mean value: 0.924248371206102
|
|
|
|
key: train_jcc
|
|
value: [0.99607843 0.99607843 0.99606299 0.99606299 1. 0.99607843
|
|
0.99215686 1. 0.99215686 0.99607843]
|
|
|
|
mean value: 0.9960753435232361
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03225613 0.02523661 0.02279449 0.02276397 0.02029443 0.02316093
|
|
0.02214956 0.02145815 0.02218223 0.02365208]
|
|
|
|
mean value: 0.02359485626220703
|
|
|
|
key: score_time
|
|
value: [0.01220226 0.00912762 0.00895953 0.00886941 0.00896859 0.00906873
|
|
0.00915599 0.00904799 0.00930071 0.00928521]
|
|
|
|
mean value: 0.009398603439331054
|
|
|
|
key: test_mcc
|
|
value: [1. 0.9321832 1. 1. 0.73127242 1.
|
|
0.89802651 0.93094934 0.92857143 1. ]
|
|
|
|
mean value: 0.9421002898482495
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96491228 1. 1. 0.85714286 1.
|
|
0.94642857 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9697055137844611
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96551724 1. 1. 0.87096774 1.
|
|
0.94915254 0.96551724 0.96428571 1. ]
|
|
|
|
mean value: 0.9715440481352701
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93333333 1. 1. 0.79411765 1.
|
|
0.90322581 0.93333333 0.96428571 1. ]
|
|
|
|
mean value: 0.9528295834462818
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.96428571 1.
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9928571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.96551724 1. 1. 0.85714286 1.
|
|
0.94642857 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9697660098522167
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.93333333 1. 1. 0.77142857 1.
|
|
0.90322581 0.93333333 0.93103448 1. ]
|
|
|
|
mean value: 0.9472355527305472
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12735271 0.1331346 0.13445926 0.12076545 0.1190021 0.1195786
|
|
0.11903858 0.12229443 0.12137938 0.11910081]
|
|
|
|
mean value: 0.12361059188842774
|
|
|
|
key: score_time
|
|
value: [0.0200398 0.02035165 0.02030778 0.01841545 0.01809835 0.01818275
|
|
0.01851726 0.01960111 0.01816249 0.01938963]
|
|
|
|
mean value: 0.019106626510620117
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 0.96547546 1. 0.89342711 0.89342711
|
|
0.93094934 0.96490128 0.96490128 1. ]
|
|
|
|
mean value: 0.9544116062967756
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.98245614 1. 0.94642857 0.94642857
|
|
0.96428571 0.98214286 0.98214286 1. ]
|
|
|
|
mean value: 0.9768796992481202
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 0.98305085 1. 0.94736842 0.94545455
|
|
0.96551724 0.98245614 0.98181818 1. ]
|
|
|
|
mean value: 0.9770577658214927
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96551724 0.96551724 0.96666667 1. 0.93103448 0.96296296
|
|
0.93333333 0.96551724 1. 1. ]
|
|
|
|
mean value: 0.9690549169859515
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 0.98214286 1. 0.94642857 0.94642857
|
|
0.96428571 0.98214286 0.98214286 1. ]
|
|
|
|
mean value: 0.976908866995074
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 0.96666667 1. 0.9 0.89655172
|
|
0.93333333 0.96551724 0.96428571 1. ]
|
|
|
|
mean value: 0.9557389162561577
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01039386 0.01049614 0.01059151 0.01052785 0.01056314 0.01058316
|
|
0.0104866 0.01042509 0.01065969 0.01042533]
|
|
|
|
mean value: 0.010515236854553222
|
|
|
|
key: score_time
|
|
value: [0.00886655 0.00900602 0.00947428 0.00894213 0.00898314 0.00901628
|
|
0.00893044 0.00892806 0.0092063 0.0089128 ]
|
|
|
|
mean value: 0.009026598930358887
|
|
|
|
key: test_mcc
|
|
value: [0.77903565 0.86189955 0.74822828 0.74822828 0.79385662 0.78772636
|
|
0.6882472 0.89802651 0.82618439 0.8660254 ]
|
|
|
|
mean value: 0.7997458248025408
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.92982456 0.85964912 0.85964912 0.89285714 0.89285714
|
|
0.82142857 0.94642857 0.91071429 0.92857143]
|
|
|
|
mean value: 0.8919172932330827
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.93103448 0.87878788 0.87878788 0.9 0.89655172
|
|
0.84848485 0.94915254 0.91525424 0.93333333]
|
|
|
|
mean value: 0.9020275814840397
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.9 0.78378378 0.78378378 0.84375 0.86666667
|
|
0.73684211 0.90322581 0.87096774 0.875 ]
|
|
|
|
mean value: 0.8364019887884488
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96428571 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9821428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.87931034 0.93041872 0.85714286 0.85714286 0.89285714 0.89285714
|
|
0.82142857 0.94642857 0.91071429 0.92857143]
|
|
|
|
mean value: 0.8916871921182267
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.87096774 0.78378378 0.78378378 0.81818182 0.8125
|
|
0.73684211 0.90322581 0.84375 0.875 ]
|
|
|
|
mean value: 0.822803503939964
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.72128773 1.77692485 1.74783516 1.83495116 1.73265433 1.9805038
|
|
1.75580668 1.82116747 1.85459757 1.74639463]
|
|
|
|
mean value: 1.797212338447571
|
|
|
|
key: score_time
|
|
value: [0.10111117 0.09629679 0.09658933 0.11028838 0.09789872 0.10091949
|
|
0.10473514 0.12026238 0.09599161 0.09527445]
|
|
|
|
mean value: 0.10193674564361573
|
|
|
|
key: test_mcc
|
|
value: [1. 0.96551724 1. 1. 0.89342711 0.93094934
|
|
0.93094934 0.93094934 0.92857143 1. ]
|
|
|
|
mean value: 0.9580363791069356
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98245614 1. 1. 0.94642857 0.96428571
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9786027568922305
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98245614 1. 1. 0.94736842 0.96296296
|
|
0.96551724 0.96551724 0.96428571 1. ]
|
|
|
|
mean value: 0.9788107721410807
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96551724 1. 1. 0.93103448 1.
|
|
0.93333333 0.93333333 0.96428571 1. ]
|
|
|
|
mean value: 0.9727504105090312
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98275862 1. 1. 0.94642857 0.96428571
|
|
0.96428571 0.96428571 0.96428571 1. ]
|
|
|
|
mean value: 0.9786330049261084
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.96551724 1. 1. 0.9 0.92857143
|
|
0.93333333 0.93333333 0.93103448 1. ]
|
|
|
|
mean value: 0.9591789819376026
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.93988156 0.98185563 0.95095325 0.95623612 1.0021174 1.01571774
|
|
0.96705437 0.9789834 1.06533718 0.96144629]
|
|
|
|
mean value: 0.9819582939147949
|
|
|
|
key: score_time
|
|
value: [0.17660475 0.25053692 0.26670814 0.16711521 0.25986362 0.19897556
|
|
0.23333478 0.21554923 0.25029039 0.24634266]
|
|
|
|
mean value: 0.22653212547302246
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.8951918 1. 1. 0.93094934 0.93094934
|
|
0.93094934 0.93094934 1. 1. ]
|
|
|
|
mean value: 0.958450638898162
|
|
|
|
key: train_mcc
|
|
value: [0.97660378 0.9685613 0.98046755 0.97275888 0.98437404 0.98050495
|
|
0.98437404 0.98437404 0.98050495 0.98050495]
|
|
|
|
mean value: 0.9793028481235929
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 1. 1. 0.96428571 0.96428571
|
|
0.96428571 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9786967418546366
|
|
|
|
key: train_accuracy
|
|
value: [0.98816568 0.98422091 0.99013807 0.98619329 0.99212598 0.99015748
|
|
0.99212598 0.99212598 0.99015748 0.99015748]
|
|
|
|
mean value: 0.9895568342418737
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.94545455 1. 1. 0.96551724 0.96296296
|
|
0.96551724 0.96551724 1. 1. ]
|
|
|
|
mean value: 0.9787425372906317
|
|
|
|
key: train_fscore
|
|
value: [0.98832685 0.984375 0.99021526 0.98635478 0.9921875 0.99025341
|
|
0.9921875 0.9921875 0.99025341 0.99025341]
|
|
|
|
mean value: 0.9896594622183483
|
|
|
|
key: test_precision
|
|
value: [0.96551724 0.96296296 1. 1. 0.93333333 1.
|
|
0.93333333 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9728480204342274
|
|
|
|
key: train_precision
|
|
value: [0.97692308 0.97674419 0.98062016 0.97307692 0.98449612 0.98069498
|
|
0.98449612 0.98449612 0.98069498 0.98069498]
|
|
|
|
mean value: 0.9802937655263236
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 1. 1. 1. 0.92857143
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 0.99212598 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992125984251968
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.94704433 1. 1. 0.96428571 0.96428571
|
|
0.96428571 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9786945812807882
|
|
|
|
key: train_roc_auc
|
|
value: [0.98814229 0.98420528 0.99015748 0.98622047 0.99212598 0.99015748
|
|
0.99212598 0.99212598 0.99015748 0.99015748]
|
|
|
|
mean value: 0.9895575923562915
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.89655172 1. 1. 0.93333333 0.92857143
|
|
0.93333333 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.959064039408867
|
|
|
|
key: train_jcc
|
|
value: [0.97692308 0.96923077 0.98062016 0.97307692 0.98449612 0.98069498
|
|
0.98449612 0.98449612 0.98069498 0.98069498]
|
|
|
|
mean value: 0.9795424238447494
|
|
|
|
MCC on Blind test: 0.83
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02351713 0.01024127 0.01087022 0.01184297 0.01054907 0.01104808
|
|
0.01173329 0.01041317 0.0110364 0.01116943]
|
|
|
|
mean value: 0.01224210262298584
|
|
|
|
key: score_time
|
|
value: [0.0089438 0.00970793 0.00972652 0.00989699 0.00950623 0.00932384
|
|
0.0098536 0.00955272 0.00994015 0.00911736]
|
|
|
|
mean value: 0.009556913375854492
|
|
|
|
key: test_mcc
|
|
value: [0.80817326 0.57973205 0.43842365 0.43842365 0.57735027 0.64285714
|
|
0.58501794 0.64285714 0.47187011 0.53605627]
|
|
|
|
mean value: 0.5720761462070288
|
|
|
|
key: train_mcc
|
|
value: [0.59369456 0.64499463 0.64499463 0.63709364 0.62999938 0.56756289
|
|
0.64173726 0.60292787 0.63779528 0.59849942]
|
|
|
|
mean value: 0.6199299551759412
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.78947368 0.71929825 0.71929825 0.78571429 0.82142857
|
|
0.78571429 0.82142857 0.73214286 0.76785714]
|
|
|
|
mean value: 0.7837092731829574
|
|
|
|
key: train_accuracy
|
|
value: [0.79684418 0.82248521 0.82248521 0.81854043 0.81496063 0.78346457
|
|
0.82086614 0.8011811 0.81889764 0.7992126 ]
|
|
|
|
mean value: 0.8098937706751153
|
|
|
|
key: test_fscore
|
|
value: [0.90322581 0.77777778 0.72413793 0.72413793 0.76923077 0.82142857
|
|
0.80645161 0.82142857 0.70588235 0.76363636]
|
|
|
|
mean value: 0.7817337687867034
|
|
|
|
key: train_fscore
|
|
value: [0.79684418 0.82213439 0.82283465 0.81746032 0.81640625 0.77822581
|
|
0.82121807 0.79678068 0.81889764 0.79761905]
|
|
|
|
mean value: 0.8088421032567705
|
|
|
|
key: test_precision
|
|
value: [0.82352941 0.80769231 0.72413793 0.72413793 0.83333333 0.82142857
|
|
0.73529412 0.82142857 0.7826087 0.77777778]
|
|
|
|
mean value: 0.7851368648793466
|
|
|
|
key: train_precision
|
|
value: [0.79841897 0.82539683 0.81960784 0.82071713 0.81007752 0.79752066
|
|
0.81960784 0.81481481 0.81889764 0.804 ]
|
|
|
|
mean value: 0.8129059248624415
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.72413793 0.72413793 0.71428571 0.82142857
|
|
0.89285714 0.82142857 0.64285714 0.75 ]
|
|
|
|
mean value: 0.7841133004926109
|
|
|
|
key: train_recall
|
|
value: [0.79527559 0.81889764 0.82608696 0.81422925 0.82283465 0.75984252
|
|
0.82283465 0.77952756 0.81889764 0.79133858]
|
|
|
|
mean value: 0.8049765024431235
|
|
|
|
key: test_roc_auc
|
|
value: [0.89655172 0.7887931 0.71921182 0.71921182 0.78571429 0.82142857
|
|
0.78571429 0.82142857 0.73214286 0.76785714]
|
|
|
|
mean value: 0.7838054187192118
|
|
|
|
key: train_roc_auc
|
|
value: [0.79684728 0.8224923 0.8224923 0.81853195 0.81496063 0.78346457
|
|
0.82086614 0.8011811 0.81889764 0.7992126 ]
|
|
|
|
mean value: 0.8098946500264542
|
|
|
|
key: test_jcc
|
|
value: [0.82352941 0.63636364 0.56756757 0.56756757 0.625 0.6969697
|
|
0.67567568 0.6969697 0.54545455 0.61764706]
|
|
|
|
mean value: 0.6452744857156621
|
|
|
|
key: train_jcc
|
|
value: [0.66229508 0.69798658 0.69899666 0.69127517 0.68976898 0.6369637
|
|
0.69666667 0.66220736 0.69333333 0.66336634]
|
|
|
|
mean value: 0.6792859850212573
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.08453417 0.06654429 0.07418919 0.08602238 0.06946802 0.08242369
|
|
0.06857514 0.07357121 0.07413006 0.07933426]
|
|
|
|
mean value: 0.07587924003601074
|
|
|
|
key: score_time
|
|
value: [0.01240754 0.01080799 0.01115561 0.01121211 0.01110983 0.01113629
|
|
0.01076126 0.01157236 0.01180458 0.0112443 ]
|
|
|
|
mean value: 0.011321187019348145
|
|
|
|
key: test_mcc
|
|
value: [1. 0.9321832 1. 0.96547546 0.89802651 1.
|
|
0.96490128 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.965643706501215
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96491228 1. 0.98245614 0.94642857 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9822368421052632
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96551724 1. 0.98305085 0.94915254 1.
|
|
0.98245614 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.9828150153290883
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93333333 1. 0.96666667 0.90322581 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9667593622543567
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.96551724 1. 0.98214286 0.94642857 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9822660098522168
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.93333333 1. 0.96666667 0.90322581 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9667593622543567
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.05864573 0.08753681 0.07722974 0.0898056 0.0872829 0.07127666
|
|
0.08494568 0.06207895 0.05319452 0.06795883]
|
|
|
|
mean value: 0.07399554252624511
|
|
|
|
key: score_time
|
|
value: [0.01858926 0.01282024 0.02478695 0.01922059 0.01901817 0.01264048
|
|
0.01914215 0.01234865 0.01900911 0.01905823]
|
|
|
|
mean value: 0.01766338348388672
|
|
|
|
key: test_mcc
|
|
value: [0.86851042 0.82512315 0.8953202 0.86789789 0.82195294 0.96490128
|
|
0.82618439 0.82618439 0.96490128 0.96490128]
|
|
|
|
mean value: 0.8825877226392335
|
|
|
|
key: train_mcc
|
|
value: [0.96055211 0.97239383 0.95266254 0.95661511 0.96850394 0.95670033
|
|
0.96850394 0.97250878 0.96062992 0.96062992]
|
|
|
|
mean value: 0.9629700415209543
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.9122807 0.94736842 0.92982456 0.91071429 0.98214286
|
|
0.91071429 0.91071429 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9397869674185464
|
|
|
|
key: train_accuracy
|
|
value: [0.98027613 0.98619329 0.97633136 0.97830375 0.98425197 0.97834646
|
|
0.98425197 0.98622047 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9814805323890727
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.9122807 0.94736842 0.93548387 0.90909091 0.98245614
|
|
0.91525424 0.91525424 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9415434131827904
|
|
|
|
key: train_fscore
|
|
value: [0.98031496 0.98624754 0.97628458 0.97830375 0.98425197 0.978389
|
|
0.98425197 0.98613861 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9814812307513464
|
|
|
|
key: test_precision
|
|
value: [0.875 0.89655172 0.96428571 0.87878788 0.92592593 0.96551724
|
|
0.87096774 0.87096774 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9179038451146349
|
|
|
|
key: train_precision
|
|
value: [0.98031496 0.98431373 0.97628458 0.97637795 0.98425197 0.97647059
|
|
0.98425197 0.99203187 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9814927542869231
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.93103448 1. 0.89285714 1.
|
|
0.96428571 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.968103448275862
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98818898 0.97628458 0.98023715 0.98425197 0.98031496
|
|
0.98425197 0.98031496 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9814789455665868
|
|
|
|
key: test_roc_auc
|
|
value: [0.93103448 0.91256158 0.9476601 0.92857143 0.91071429 0.98214286
|
|
0.91071429 0.91071429 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9398399014778326
|
|
|
|
key: train_roc_auc
|
|
value: [0.98027606 0.98618935 0.97633127 0.97830755 0.98425197 0.97834646
|
|
0.98425197 0.98622047 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9814805016961813
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.83870968 0.9 0.87878788 0.83333333 0.96551724
|
|
0.84375 0.84375 0.96551724 0.96551724]
|
|
|
|
mean value: 0.8909882613678498
|
|
|
|
key: train_jcc
|
|
value: [0.96138996 0.97286822 0.95366795 0.95752896 0.96899225 0.95769231
|
|
0.96899225 0.97265625 0.96138996 0.96138996]
|
|
|
|
mean value: 0.9636568066237398
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01440048 0.01289797 0.00995111 0.0097754 0.00982618 0.01017928
|
|
0.00993538 0.00999284 0.01034856 0.00996208]
|
|
|
|
mean value: 0.0107269287109375
|
|
|
|
key: score_time
|
|
value: [0.01222897 0.00954247 0.00876284 0.0087254 0.00896811 0.00879383
|
|
0.00874734 0.00874972 0.00899076 0.00880122]
|
|
|
|
mean value: 0.009231066703796387
|
|
|
|
key: test_mcc
|
|
value: [0.64889453 0.64901478 0.6166424 0.58076493 0.50128041 0.61065803
|
|
0.52174919 0.50128041 0.53881591 0.67900461]
|
|
|
|
mean value: 0.5848105187193532
|
|
|
|
key: train_mcc
|
|
value: [0.61406315 0.6462136 0.65745192 0.5827872 0.67819632 0.57508846
|
|
0.6265721 0.67097829 0.68811802 0.54745203]
|
|
|
|
mean value: 0.6286921092607803
|
|
|
|
key: test_accuracy
|
|
value: [0.80701754 0.8245614 0.80701754 0.78947368 0.75 0.80357143
|
|
0.75 0.75 0.76785714 0.83928571]
|
|
|
|
mean value: 0.7888784461152882
|
|
|
|
key: train_accuracy
|
|
value: [0.80670611 0.82248521 0.82840237 0.79092702 0.83858268 0.78740157
|
|
0.81299213 0.83464567 0.84251969 0.77362205]
|
|
|
|
mean value: 0.813828448958673
|
|
|
|
key: test_fscore
|
|
value: [0.83076923 0.82142857 0.81967213 0.78571429 0.74074074 0.81355932
|
|
0.78125 0.75862069 0.75471698 0.84210526]
|
|
|
|
mean value: 0.7948577215779411
|
|
|
|
key: train_fscore
|
|
value: [0.81153846 0.82824427 0.83172147 0.79615385 0.84291188 0.79069767
|
|
0.81695568 0.84030418 0.84962406 0.77669903]
|
|
|
|
mean value: 0.8184850560127853
|
|
|
|
key: test_precision
|
|
value: [0.72972973 0.82142857 0.78125 0.81481481 0.76923077 0.77419355
|
|
0.69444444 0.73333333 0.8 0.82758621]
|
|
|
|
mean value: 0.7746011418265312
|
|
|
|
key: train_precision
|
|
value: [0.79323308 0.8037037 0.81439394 0.7752809 0.82089552 0.77862595
|
|
0.8 0.8125 0.81294964 0.76628352]
|
|
|
|
mean value: 0.7977866266459333
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.82142857 0.86206897 0.75862069 0.71428571 0.85714286
|
|
0.89285714 0.78571429 0.71428571 0.85714286]
|
|
|
|
mean value: 0.8227832512315271
|
|
|
|
key: train_recall
|
|
value: [0.83070866 0.85433071 0.84980237 0.81818182 0.86614173 0.80314961
|
|
0.83464567 0.87007874 0.88976378 0.78740157]
|
|
|
|
mean value: 0.8404204662164265
|
|
|
|
key: test_roc_auc
|
|
value: [0.80972906 0.82450739 0.80603448 0.79002463 0.75 0.80357143
|
|
0.75 0.75 0.76785714 0.83928571]
|
|
|
|
mean value: 0.7891009852216748
|
|
|
|
key: train_roc_auc
|
|
value: [0.80665868 0.82242227 0.82844449 0.79098067 0.83858268 0.78740157
|
|
0.81299213 0.83464567 0.84251969 0.77362205]
|
|
|
|
mean value: 0.8138269895116865
|
|
|
|
key: test_jcc
|
|
value: [0.71052632 0.6969697 0.69444444 0.64705882 0.58823529 0.68571429
|
|
0.64102564 0.61111111 0.60606061 0.72727273]
|
|
|
|
mean value: 0.6608418946035045
|
|
|
|
key: train_jcc
|
|
value: [0.6828479 0.70684039 0.71192053 0.66134185 0.72847682 0.65384615
|
|
0.69055375 0.72459016 0.73856209 0.63492063]
|
|
|
|
mean value: 0.6933900281480951
|
|
|
|
MCC on Blind test: 0.59
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01517582 0.02547216 0.03090811 0.02699566 0.02723742 0.02942634
|
|
0.02590513 0.02685237 0.03164697 0.02517891]
|
|
|
|
mean value: 0.026479887962341308
|
|
|
|
key: score_time
|
|
value: [0.01010776 0.01122403 0.01200104 0.01206923 0.01203632 0.01194191
|
|
0.01193762 0.01199579 0.01200461 0.01850152]
|
|
|
|
mean value: 0.012381982803344727
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.82512315 0.89988258 0.96551724 0.78772636 0.96490128
|
|
0.79385662 0.89802651 0.89342711 0.92857143]
|
|
|
|
mean value: 0.8922549527400198
|
|
|
|
key: train_mcc
|
|
value: [0.90714511 0.97239383 0.942062 0.94550473 0.95687833 0.93843444
|
|
0.95687833 0.98050495 0.97649905 0.96463421]
|
|
|
|
mean value: 0.954093498873719
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.9122807 0.94736842 0.98245614 0.89285714 0.98214286
|
|
0.89285714 0.94642857 0.94642857 0.96428571]
|
|
|
|
mean value: 0.9449561403508772
|
|
|
|
key: train_accuracy
|
|
value: [0.95266272 0.98619329 0.9704142 0.97238659 0.97834646 0.96850394
|
|
0.97834646 0.99015748 0.98818898 0.98228346]
|
|
|
|
mean value: 0.9767483576387271
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.9122807 0.94545455 0.98245614 0.89655172 0.98245614
|
|
0.9 0.94915254 0.94736842 0.96428571]
|
|
|
|
mean value: 0.9462462070110721
|
|
|
|
key: train_fscore
|
|
value: [0.95121951 0.98624754 0.96957404 0.97177419 0.9785575 0.96934866
|
|
0.9785575 0.99025341 0.98828125 0.98217822]
|
|
|
|
mean value: 0.9765991834337232
|
|
|
|
key: test_precision
|
|
value: [0.96551724 0.89655172 1. 1. 0.86666667 0.96551724
|
|
0.84375 0.90322581 0.93103448 0.96428571]
|
|
|
|
mean value: 0.9336548877059166
|
|
|
|
key: train_precision
|
|
value: [0.98319328 0.98431373 0.99583333 0.99176955 0.96911197 0.94402985
|
|
0.96911197 0.98069498 0.98062016 0.98804781]
|
|
|
|
mean value: 0.9786726616928444
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.89655172 0.96551724 0.92857143 1.
|
|
0.96428571 1. 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9612068965517242
|
|
|
|
key: train_recall
|
|
value: [0.92125984 0.98818898 0.94466403 0.95256917 0.98818898 0.99606299
|
|
0.98818898 1. 0.99606299 0.97637795]
|
|
|
|
mean value: 0.9751563910242446
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.91256158 0.94827586 0.98275862 0.89285714 0.98214286
|
|
0.89285714 0.94642857 0.94642857 0.96428571]
|
|
|
|
mean value: 0.9451354679802956
|
|
|
|
key: train_roc_auc
|
|
value: [0.95272478 0.98618935 0.97036351 0.97234758 0.97834646 0.96850394
|
|
0.97834646 0.99015748 0.98818898 0.98228346]
|
|
|
|
mean value: 0.9767451993402011
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.83870968 0.89655172 0.96551724 0.8125 0.96551724
|
|
0.81818182 0.90322581 0.9 0.93103448]
|
|
|
|
mean value: 0.8996755233087269
|
|
|
|
key: train_jcc
|
|
value: [0.90697674 0.97286822 0.94094488 0.94509804 0.95801527 0.94052045
|
|
0.95801527 0.98069498 0.97683398 0.96498054]
|
|
|
|
mean value: 0.9544948365069599
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02158213 0.01965857 0.02065611 0.01798463 0.01837373 0.01744699
|
|
0.01999712 0.0184269 0.02369738 0.02024031]
|
|
|
|
mean value: 0.019806385040283203
|
|
|
|
key: score_time
|
|
value: [0.01195168 0.01195455 0.01196837 0.01192522 0.01188898 0.011935
|
|
0.01193929 0.01191378 0.01197457 0.01197529]
|
|
|
|
mean value: 0.011942672729492187
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.64058163 0.89952865 1. 0.75434227 0.89802651
|
|
0.74535599 0.8660254 0.75047877 0.93094934]
|
|
|
|
mean value: 0.841747176363628
|
|
|
|
key: train_mcc
|
|
value: [0.94524716 0.82552467 0.88136732 0.90342654 0.89677099 0.92228969
|
|
0.92779624 0.92985478 0.98038334 0.89990029]
|
|
|
|
mean value: 0.9112561021351867
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.78947368 0.94736842 1. 0.875 0.94642857
|
|
0.85714286 0.92857143 0.875 0.96428571]
|
|
|
|
mean value: 0.9148182957393484
|
|
|
|
key: train_accuracy
|
|
value: [0.97238659 0.90532544 0.93885602 0.95069034 0.94685039 0.96062992
|
|
0.96259843 0.96456693 0.99015748 0.9488189 ]
|
|
|
|
mean value: 0.9540880429887092
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.82352941 0.95081967 1. 0.88135593 0.94339623
|
|
0.875 0.93333333 0.87272727 0.96551724]
|
|
|
|
mean value: 0.9211196331333564
|
|
|
|
key: train_fscore
|
|
value: [0.972 0.91366906 0.94139887 0.95219885 0.9489603 0.95967742
|
|
0.96394687 0.96525097 0.99021526 0.95057034]
|
|
|
|
mean value: 0.9557887945831837
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.7 0.90625 1. 0.83870968 1.
|
|
0.77777778 0.875 0.88888889 0.93333333]
|
|
|
|
mean value: 0.8853293010752689
|
|
|
|
key: train_precision
|
|
value: [0.98780488 0.8410596 0.90217391 0.92222222 0.91272727 0.98347107
|
|
0.93040293 0.9469697 0.9844358 0.91911765]
|
|
|
|
mean value: 0.9330385035167746
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.92857143 0.89285714
|
|
1. 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.9678571428571429
|
|
|
|
key: train_recall
|
|
value: [0.95669291 1. 0.98418972 0.98418972 0.98818898 0.93700787
|
|
1. 0.98425197 0.99606299 0.98425197]
|
|
|
|
mean value: 0.9814836139553702
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.79310345 0.94642857 1. 0.875 0.94642857
|
|
0.85714286 0.92857143 0.875 0.96428571]
|
|
|
|
mean value: 0.9151477832512316
|
|
|
|
key: train_roc_auc
|
|
value: [0.9724176 0.90513834 0.93894526 0.95075628 0.94685039 0.96062992
|
|
0.96259843 0.96456693 0.99015748 0.9488189 ]
|
|
|
|
mean value: 0.9540879524446796
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.7 0.90625 1. 0.78787879 0.89285714
|
|
0.77777778 0.875 0.77419355 0.93333333]
|
|
|
|
mean value: 0.8580623923567472
|
|
|
|
key: train_jcc
|
|
value: [0.94552529 0.8410596 0.88928571 0.90875912 0.9028777 0.92248062
|
|
0.93040293 0.93283582 0.98062016 0.9057971 ]
|
|
|
|
mean value: 0.9159644058634359
|
|
|
|
MCC on Blind test: 0.73
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18949628 0.16845989 0.17046356 0.17315173 0.18008161 0.17156911
|
|
0.17026114 0.16903186 0.18008924 0.17638707]
|
|
|
|
mean value: 0.17489914894104003
|
|
|
|
key: score_time
|
|
value: [0.01661611 0.01543522 0.01551104 0.01576376 0.01544642 0.01590562
|
|
0.01548719 0.01546669 0.01654816 0.01654792]
|
|
|
|
mean value: 0.015872812271118163
|
|
|
|
key: test_mcc
|
|
value: [1. 0.96551724 1. 0.96547546 0.83484711 1.
|
|
0.96490128 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.962659170679551
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98245614 1. 0.98245614 0.91071429 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9804197994987468
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98245614 1. 0.98305085 0.91803279 1.
|
|
0.98245614 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.9813969296774815
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96551724 1. 0.96666667 0.84848485 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9645036572622779
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98275862 1. 0.98214286 0.91071429 1.
|
|
0.98214286 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9804187192118227
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.96551724 1. 0.96666667 0.84848485 1.
|
|
0.96551724 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9645036572622779
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05747509 0.07012534 0.08541584 0.09682059 0.08754301 0.08933735
|
|
0.07895207 0.0594759 0.07882285 0.08167696]
|
|
|
|
mean value: 0.07856450080871583
|
|
|
|
key: score_time
|
|
value: [0.01824808 0.03937316 0.03136182 0.03505492 0.03816462 0.03392482
|
|
0.03387403 0.02909517 0.02104235 0.0223937 ]
|
|
|
|
mean value: 0.030253267288208006
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8951918 1. 0.96547546 0.83484711 1.
|
|
0.93094934 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.9522314322910705
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99211042 0.99214142 0.99606299 1. 1.
|
|
0.99607071 0.99607071 0.99607071 0.99215674]
|
|
|
|
mean value: 0.9960683715267692
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94736842 1. 0.98245614 0.91071429 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9751253132832081
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99605523 0.99605523 0.99802761 1. 1.
|
|
0.9980315 0.9980315 0.9980315 0.99606299]
|
|
|
|
mean value: 0.9980295547376105
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94545455 1. 0.98305085 0.91803279 1.
|
|
0.96551724 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.9760028802906916
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99606299 0.99606299 0.99802761 1. 1.
|
|
0.99803536 0.99803536 0.99803536 0.99607843]
|
|
|
|
mean value: 0.9980338119410027
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 1. 0.96666667 0.84848485 1.
|
|
0.93333333 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9610298386160455
|
|
|
|
key: train_precision
|
|
value: [1. 0.99606299 0.99215686 0.99606299 1. 1.
|
|
0.99607843 0.99607843 0.99607843 0.9921875 ]
|
|
|
|
mean value: 0.9964705641114714
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9928571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 0.99606299 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9996062992125985
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.94704433 1. 0.98214286 0.91071429 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9750615763546798
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99605521 0.99606299 0.9980315 1. 1.
|
|
0.9980315 0.9980315 0.9980315 0.99606299]
|
|
|
|
mean value: 0.9980307179981949
|
|
|
|
key: test_jcc
|
|
value: [1. 0.89655172 1. 0.96666667 0.84848485 1.
|
|
0.93333333 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9543887147335424
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99215686 0.99215686 0.99606299 1. 1.
|
|
0.99607843 0.99607843 0.99607843 0.9921875 ]
|
|
|
|
mean value: 0.9960799511733828
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17560363 0.20586491 0.16315055 0.14851332 0.21609211 0.3052907
|
|
0.25830936 0.24186087 0.25000715 0.23897648]
|
|
|
|
mean value: 0.220366907119751
|
|
|
|
key: score_time
|
|
value: [0.02612948 0.02948737 0.01560521 0.02712798 0.03579521 0.02561378
|
|
0.02915239 0.02643967 0.0419426 0.02583075]
|
|
|
|
mean value: 0.02831244468688965
|
|
|
|
key: test_mcc
|
|
value: [0.83797038 0.82942474 0.77728159 0.96547546 0.76225171 0.82195294
|
|
0.80439967 0.93094934 0.89342711 0.96490128]
|
|
|
|
mean value: 0.8588034211725772
|
|
|
|
key: train_mcc
|
|
value: [0.98434291 0.98434291 0.98823511 0.98823511 0.98437404 0.99607071
|
|
0.98437404 0.98437404 0.98825791 0.99215674]
|
|
|
|
mean value: 0.9874763527102344
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.9122807 0.87719298 0.98245614 0.875 0.91071429
|
|
0.89285714 0.96428571 0.94642857 0.98214286]
|
|
|
|
mean value: 0.925563909774436
|
|
|
|
key: train_accuracy
|
|
value: [0.99211045 0.99211045 0.99408284 0.99408284 0.99212598 0.9980315
|
|
0.99212598 0.99212598 0.99409449 0.99606299]
|
|
|
|
mean value: 0.9936953516905062
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.91525424 0.89230769 0.98305085 0.8852459 0.9122807
|
|
0.90322581 0.96551724 0.94736842 0.98245614]
|
|
|
|
mean value: 0.9304739776566863
|
|
|
|
key: train_fscore
|
|
value: [0.9921875 0.9921875 0.99410609 0.99410609 0.9921875 0.99803536
|
|
0.9921875 0.9921875 0.99412916 0.99607843]
|
|
|
|
mean value: 0.9937392634089591
|
|
|
|
key: test_precision
|
|
value: [0.84848485 0.87096774 0.80555556 0.96666667 0.81818182 0.89655172
|
|
0.82352941 0.93333333 0.93103448 0.96551724]
|
|
|
|
mean value: 0.8859822824198275
|
|
|
|
key: train_precision
|
|
value: [0.98449612 0.98449612 0.98828125 0.98828125 0.98449612 0.99607843
|
|
0.98449612 0.98449612 0.98832685 0.9921875 ]
|
|
|
|
mean value: 0.9875635899776615
|
|
|
|
key: test_recall
|
|
value: [1. 0.96428571 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9821428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9137931 0.91317734 0.875 0.98214286 0.875 0.91071429
|
|
0.89285714 0.96428571 0.94642857 0.98214286]
|
|
|
|
mean value: 0.9255541871921182
|
|
|
|
key: train_roc_auc
|
|
value: [0.99209486 0.99209486 0.99409449 0.99409449 0.99212598 0.9980315
|
|
0.99212598 0.99212598 0.99409449 0.99606299]
|
|
|
|
mean value: 0.9936945628831969
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.84375 0.80555556 0.96666667 0.79411765 0.83870968
|
|
0.82352941 0.93333333 0.9 0.96551724]
|
|
|
|
mean value: 0.8719664381662598
|
|
|
|
key: train_jcc
|
|
value: [0.98449612 0.98449612 0.98828125 0.98828125 0.98449612 0.99607843
|
|
0.98449612 0.98449612 0.98832685 0.9921875 ]
|
|
|
|
mean value: 0.9875635899776615
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.67622995 0.6558485 0.66100764 0.65075469 0.66132975 0.65230036
|
|
0.65977788 0.65869689 0.66375089 0.65631771]
|
|
|
|
mean value: 0.6596014261245727
|
|
|
|
key: score_time
|
|
value: [0.00952291 0.00961614 0.0093987 0.00940108 0.00963163 0.00945449
|
|
0.00950718 0.00944567 0.00945544 0.0094955 ]
|
|
|
|
mean value: 0.009492874145507812
|
|
|
|
key: test_mcc
|
|
value: [1. 0.96551724 1. 1. 0.89802651 1.
|
|
0.93094934 0.93094934 0.96490128 1. ]
|
|
|
|
mean value: 0.9690343705369726
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98245614 1. 1. 0.94642857 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9839598997493735
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98245614 1. 1. 0.94915254 1.
|
|
0.96551724 0.96551724 0.98245614 1. ]
|
|
|
|
mean value: 0.9845099305833256
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96551724 1. 1. 0.90322581 1.
|
|
0.93333333 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9700926955876901
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98275862 1. 1. 0.94642857 1.
|
|
0.96428571 0.96428571 0.98214286 1. ]
|
|
|
|
mean value: 0.9839901477832512
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.96551724 1. 1. 0.90322581 1.
|
|
0.93333333 0.93333333 0.96551724 1. ]
|
|
|
|
mean value: 0.9700926955876901
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03066301 0.03192091 0.03159785 0.03225589 0.03112817 0.0315671
|
|
0.05233073 0.07486224 0.05127001 0.05236316]
|
|
|
|
mean value: 0.041995906829833986
|
|
|
|
key: score_time
|
|
value: [0.01229358 0.01268935 0.01368308 0.01742101 0.01394725 0.01394153
|
|
0.01992059 0.0179832 0.01861358 0.02085924]
|
|
|
|
mean value: 0.016135239601135255
|
|
|
|
key: test_mcc
|
|
value: [1. 0.96547546 0.96547546 1. 0.96490128 0.89342711
|
|
1. 1. 0.96490128 1. ]
|
|
|
|
mean value: 0.9754180588113227
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.98245614 0.98245614 1. 0.98214286 0.94642857
|
|
1. 1. 0.98214286 1. ]
|
|
|
|
mean value: 0.987562656641604
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.98181818 0.98305085 1. 0.98181818 0.94545455
|
|
1. 1. 0.98181818 1. ]
|
|
|
|
mean value: 0.9873959938366718
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96666667 1. 1. 0.96296296
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9929629629629629
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96428571 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9821428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.98214286 0.98214286 1. 0.98214286 0.94642857
|
|
1. 1. 0.98214286 1. ]
|
|
|
|
mean value: 0.9875
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.96428571 0.96666667 1. 0.96428571 0.89655172
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9756075533661741
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.05
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03958082 0.03368187 0.04933643 0.03995538 0.03961086 0.03955531
|
|
0.04075122 0.02445912 0.06261611 0.03506374]
|
|
|
|
mean value: 0.04046108722686768
|
|
|
|
key: score_time
|
|
value: [0.03454447 0.01914907 0.01997089 0.0189991 0.01908541 0.01905084
|
|
0.02588892 0.02134299 0.02253413 0.02127409]
|
|
|
|
mean value: 0.022183990478515624
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.8615634 0.8953202 1. 0.82195294 1.
|
|
0.79385662 0.85933785 0.82195294 0.93094934]
|
|
|
|
mean value: 0.8917116490649181
|
|
|
|
key: train_mcc
|
|
value: [0.95269145 0.9605814 0.94872553 0.96055211 0.96853396 0.95670033
|
|
0.95675965 0.96853396 0.95670033 0.95670033]
|
|
|
|
mean value: 0.9586479053242231
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.92982456 0.94736842 1. 0.91071429 1.
|
|
0.89285714 0.92857143 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9449248120300752
|
|
|
|
key: train_accuracy
|
|
value: [0.97633136 0.98027613 0.97435897 0.98027613 0.98425197 0.97834646
|
|
0.97834646 0.98425197 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9793132367329823
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.92592593 0.94736842 1. 0.9122807 1.
|
|
0.9 0.93103448 0.90909091 0.96551724]
|
|
|
|
mean value: 0.9456734923341094
|
|
|
|
key: train_fscore
|
|
value: [0.97647059 0.98039216 0.97435897 0.98023715 0.98431373 0.978389
|
|
0.97847358 0.98418972 0.97830375 0.978389 ]
|
|
|
|
mean value: 0.9793517647236116
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.96153846 0.96428571 1. 0.89655172 1.
|
|
0.84375 0.9 0.92592593 0.93333333]
|
|
|
|
mean value: 0.93587184925547
|
|
|
|
key: train_precision
|
|
value: [0.97265625 0.9765625 0.97244094 0.98023715 0.98046875 0.97647059
|
|
0.97276265 0.98809524 0.98023715 0.97647059]
|
|
|
|
mean value: 0.9776401813662509
|
|
|
|
key: test_recall
|
|
value: [1. 0.89285714 0.93103448 1. 0.92857143 1.
|
|
0.96428571 0.96428571 0.89285714 1. ]
|
|
|
|
mean value: 0.9573891625615764
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98425197 0.97628458 0.98023715 0.98818898 0.98031496
|
|
0.98425197 0.98031496 0.97637795 0.98031496]
|
|
|
|
mean value: 0.9810852447791852
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.92918719 0.9476601 1. 0.91071429 1.
|
|
0.89285714 0.92857143 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9449507389162561
|
|
|
|
key: train_roc_auc
|
|
value: [0.97632349 0.98026828 0.97436276 0.98027606 0.98425197 0.97834646
|
|
0.97834646 0.98425197 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9793120351062837
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.86206897 0.9 1. 0.83870968 1.
|
|
0.81818182 0.87096774 0.83333333 0.93333333]
|
|
|
|
mean value: 0.8989928203053899
|
|
|
|
key: train_jcc
|
|
value: [0.95402299 0.96153846 0.95 0.96124031 0.96911197 0.95769231
|
|
0.95785441 0.9688716 0.95752896 0.95769231]
|
|
|
|
mean value: 0.9595553303608277
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.30773807 0.29322624 0.32750988 0.32462645 0.31639314 0.32397628
|
|
0.31056094 0.29872632 0.36270189 0.40175366]
|
|
|
|
mean value: 0.3267212867736816
|
|
|
|
key: score_time
|
|
value: [0.01916409 0.01916027 0.01916218 0.01910424 0.01910162 0.01980019
|
|
0.01962781 0.02352262 0.0193069 0.01963854]
|
|
|
|
mean value: 0.019758844375610353
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.8615634 0.8953202 1. 0.75047877 1.
|
|
0.79385662 0.85933785 0.82195294 0.93094934]
|
|
|
|
mean value: 0.8845642321659994
|
|
|
|
key: train_mcc
|
|
value: [0.95269145 0.9605814 0.94872553 0.96055211 0.9645744 0.95670033
|
|
0.95675965 0.96853396 0.95670033 0.95670033]
|
|
|
|
mean value: 0.9582519495785499
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.92982456 0.94736842 1. 0.875 1.
|
|
0.89285714 0.92857143 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9413533834586466
|
|
|
|
key: train_accuracy
|
|
value: [0.97633136 0.98027613 0.97435897 0.98027613 0.98228346 0.97834646
|
|
0.97834646 0.98425197 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9791163863392816
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.92592593 0.94736842 1. 0.87719298 1.
|
|
0.9 0.93103448 0.90909091 0.96551724]
|
|
|
|
mean value: 0.9421647204042848
|
|
|
|
key: train_fscore
|
|
value: [0.97647059 0.98039216 0.97435897 0.98023715 0.98231827 0.978389
0.97847358 0.98418972 0.97830375 0.978389 ]

mean value: 0.9791522192865763

/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:148: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:151: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.96153846 0.96428571 1. 0.86206897 1.
|
|
0.84375 0.9 0.92592593 0.93333333]
|
|
|
|
mean value: 0.932423573393401
|
|
|
|
key: train_precision
|
|
value: [0.97265625 0.9765625 0.97244094 0.98023715 0.98039216 0.97647059
|
|
0.97276265 0.98809524 0.98023715 0.97647059]
|
|
|
|
mean value: 0.9776325220525254
|
|
|
|
key: test_recall
|
|
value: [1. 0.89285714 0.93103448 1. 0.89285714 1.
|
|
0.96428571 0.96428571 0.89285714 1. ]
|
|
|
|
mean value: 0.9538177339901478
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98425197 0.97628458 0.98023715 0.98425197 0.98031496
|
|
0.98425197 0.98031496 0.97637795 0.98031496]
|
|
|
|
mean value: 0.9806915439917837
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.92918719 0.9476601 1. 0.875 1.
|
|
0.89285714 0.92857143 0.91071429 0.96428571]
|
|
|
|
mean value: 0.9413793103448276
|
|
|
|
key: train_roc_auc
|
|
value: [0.97632349 0.98026828 0.97436276 0.98027606 0.98228346 0.97834646
|
|
0.97834646 0.98425197 0.97834646 0.97834646]
|
|
|
|
mean value: 0.979115184712583
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.86206897 0.9 1. 0.78125 1.
|
|
0.81818182 0.87096774 0.83333333 0.93333333]
|
|
|
|
mean value: 0.8932468525634544
|
|
|
|
key: train_jcc
|
|
value: [0.95402299 0.96153846 0.95 0.96124031 0.96525097 0.95769231
|
|
0.95785441 0.9688716 0.95752896 0.95769231]
|
|
|
|
mean value: 0.9591692299747274
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02975988 0.03407001 0.0321157 0.04390645 0.05316114 0.07947993
|
|
0.02710176 0.03776312 0.04052353 0.02836466]
|
|
|
|
mean value: 0.04062461853027344
|
|
|
|
key: score_time
|
|
value: [0.01219678 0.01215196 0.01284242 0.01259923 0.01233125 0.01213431
|
|
0.01271582 0.01214218 0.01203442 0.0119319 ]
|
|
|
|
mean value: 0.012308025360107422
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 1. 0.6000992 0.66143783 0.87287156 0.87287156
|
|
0.32732684 0.66143783 0.32732684 0.53452248]
|
|
|
|
mean value: 0.6487834918450751
|
|
|
|
key: train_mcc
|
|
value: [0.8979331 0.88273483 0.89791134 0.86948194 0.8687127 0.8687127
|
|
0.91277477 0.91240409 0.8687127 0.91277477]
|
|
|
|
mean value: 0.8892152945204359
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 1. 0.8 0.8 0.93333333 0.93333333
|
|
0.66666667 0.8 0.66666667 0.73333333]
|
|
|
|
mean value: 0.8145833333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.94852941 0.94117647 0.94890511 0.93430657 0.93430657 0.93430657
|
|
0.95620438 0.95620438 0.93430657 0.95620438]
|
|
|
|
mean value: 0.9444450407900387
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 1. 0.76923077 0.82352941 0.92307692 0.92307692
|
|
0.70588235 0.76923077 0.70588235 0.8 ]
|
|
|
|
mean value: 0.8243438914027149
|
|
|
|
key: train_fscore
|
|
value: [0.94736842 0.94029851 0.94890511 0.93333333 0.93430657 0.93430657
|
|
0.95522388 0.95588235 0.93430657 0.95522388]
|
|
|
|
mean value: 0.9439155193502107
|
|
|
|
key: test_precision
|
|
value: [0.77777778 1. 0.83333333 0.7 1. 1.
|
|
0.66666667 1. 0.66666667 0.66666667]
|
|
|
|
mean value: 0.8311111111111111
|
|
|
|
key: train_precision
|
|
value: [0.96923077 0.95454545 0.95588235 0.95454545 0.94117647 0.94117647
|
|
0.96969697 0.95588235 0.92753623 0.96969697]
|
|
|
|
mean value: 0.95393694966585
|
|
|
|
key: test_recall
|
|
value: [0.875 1. 0.71428571 1. 0.85714286 0.85714286
|
|
0.75 0.625 0.75 1. ]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.92647059 0.92647059 0.94202899 0.91304348 0.92753623 0.92753623
|
|
0.94117647 0.95588235 0.94117647 0.94117647]
|
|
|
|
mean value: 0.9342497868712702
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 1. 0.79464286 0.8125 0.92857143 0.92857143
|
|
0.66071429 0.8125 0.66071429 0.71428571]
|
|
|
|
mean value: 0.8125
|
|
|
|
key: train_roc_auc
|
|
value: [0.94852941 0.94117647 0.94895567 0.93446292 0.93435635 0.93435635
|
|
0.95609548 0.95620205 0.93435635 0.95609548]
|
|
|
|
mean value: 0.944458653026428
|
|
|
|
key: test_jcc
|
|
value: [0.7 1. 0.625 0.7 0.85714286 0.85714286
|
|
0.54545455 0.625 0.54545455 0.66666667]
|
|
|
|
mean value: 0.7121861471861471
|
|
|
|
key: train_jcc
|
|
value: [0.9 0.88732394 0.90277778 0.875 0.87671233 0.87671233
|
|
0.91428571 0.91549296 0.87671233 0.91428571]
|
|
|
|
mean value: 0.8939303094059027
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegressionCV(random_state=42))])
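
The pipeline printed above scales numeric columns with `MinMaxScaler`, one-hot encodes categorical columns, and passes the result to `LogisticRegressionCV`. The per-fold keys that follow it (`fit_time`, `test_mcc`, `train_jcc`, and so on) have the shape of a scikit-learn `cross_validate` result with `return_train_score=True`. The following is a rough sketch of that setup, not the script that wrote this log. The toy DataFrame, the reduced column list, the scorer dictionary, `handle_unknown="ignore"`, `max_iter=1000`, and the fold settings are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import (jaccard_score, make_scorer, matthews_corrcoef,
                             roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; only the column names are borrowed from the log.
rng = np.random.default_rng(42)
n = 120
y = np.tile([0, 1], n // 2)
df = pd.DataFrame({
    "ligand_distance": rng.normal(loc=y * 2.0, scale=1.0),
    "ddg_foldx": rng.normal(size=n),
    "ss_class": np.resize(["helix", "sheet", "loop"], n),
})

prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), ["ligand_distance", "ddg_foldx"]),
        # handle_unknown="ignore" is an addition for robustness; the log
        # shows a default OneHotEncoder().
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["ss_class"]),
    ],
    remainder="passthrough",
)
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegressionCV(random_state=42, max_iter=1000)),
])

# Scorer names are chosen so the result keys match the log (test_mcc,
# train_jcc, ...). Scoring roc_auc from predicted labels is an assumption.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": make_scorer(roc_auc_score),
    "jcc": make_scorer(jaccard_score),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, df, y, cv=cv, scoring=scoring,
                        return_train_score=True)

# Print each metric in the same key / value / mean layout as the log.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Each array holds one entry per fold, so ten splits give ten values per key, as in the output that follows.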
|
|
|
|
key: fit_time
value: [0.8173337 1.18391466 0.87638569 1.19149208 0.98477817 0.98145533
1.09080052 0.8652916 0.64874887 0.81251073]

mean value: 0.9452711343765259

key: score_time
value: [0.01361656 0.01263309 0.01250887 0.0123415 0.0226686 0.01354671
0.01257586 0.01254773 0.01364255 0.01389027]

mean value: 0.013997173309326172

key: test_mcc
value: [0.62994079 1. 0.6000992 0.76376262 0.87287156 0.73214286
0.32732684 0.66143783 0.73214286 0.53452248]

mean value: 0.6854247024498333

key: train_mcc
value: [0.94158382 0.94117647 0.95630861 0.97122151 0.91240409 1.
0.8251228 0.95629932 1. 0.98550418]

mean value: 0.9489620795067616

key: test_accuracy
value: [0.8125 1. 0.8 0.86666667 0.93333333 0.86666667
0.66666667 0.8 0.86666667 0.73333333]

mean value: 0.8345833333333333

key: train_accuracy
value: [0.97058824 0.97058824 0.97810219 0.98540146 0.95620438 1.
0.91240876 0.97810219 1. 0.99270073]

mean value: 0.9744096178617433

key: test_fscore
value: [0.82352941 1. 0.76923077 0.875 0.92307692 0.85714286
0.70588235 0.76923077 0.875 0.8 ]

mean value: 0.8398093083387201

key: train_fscore
value: [0.97014925 0.97058824 0.97810219 0.98529412 0.95652174 1.
0.91044776 0.97777778 1. 0.99259259]

mean value: 0.9741473667148377

key: test_precision
value: [0.77777778 1. 0.83333333 0.77777778 1. 0.85714286
0.66666667 1. 0.875 0.66666667]

mean value: 0.8454365079365079

key: train_precision
value: [0.98484848 0.97058824 0.98529412 1. 0.95652174 1.
0.92424242 0.98507463 1. 1. ]

mean value: 0.9806569628028192

key: test_recall
value: [0.875 1. 0.71428571 1. 0.85714286 0.85714286
0.75 0.625 0.875 1. ]

mean value: 0.8553571428571428

key: train_recall
value: [0.95588235 0.97058824 0.97101449 0.97101449 0.95652174 1.
0.89705882 0.97058824 1. 0.98529412]

mean value: 0.9677962489343563

key: test_roc_auc
value: [0.8125 1. 0.79464286 0.875 0.92857143 0.86607143
0.66071429 0.8125 0.86607143 0.71428571]

mean value: 0.8330357142857143

key: train_roc_auc
value: [0.97058824 0.97058824 0.97815431 0.98550725 0.95620205 1.
0.91229753 0.97804774 1. 0.99264706]

mean value: 0.9744032395566923

key: test_jcc
value: [0.7 1. 0.625 0.77777778 0.85714286 0.75
0.54545455 0.625 0.77777778 0.66666667]

mean value: 0.7324819624819625

key: train_jcc
value: [0.94202899 0.94285714 0.95714286 0.97101449 0.91666667 1.
0.83561644 0.95652174 1. 0.98529412]

mean value: 0.9507142440061194

MCC on Blind test: 0.73

Accuracy on Blind test: 0.9

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01516128 0.01172233 0.00925016 0.00884438 0.00867152 0.00968885
0.00902534 0.0096035 0.00953603 0.00910902]

mean value: 0.010061240196228028

key: score_time
value: [0.01382613 0.00929689 0.00917935 0.00864553 0.0086813 0.00937629
0.00872087 0.00905013 0.00939441 0.00907898]

mean value: 0.009524989128112792

key: test_mcc
value: [0.40451992 0.67419986 0.34247476 0.47245559 0.41931393 0.34247476
0.09449112 0.49099025 0.33928571 0.21821789]

mean value: 0.3798423801118058

key: train_mcc
value: [0.57294631 0.54111596 0.60664573 0.55776902 0.59674775 0.57705251
0.64235303 0.72621377 0.73836136 0.5626648 ]

mean value: 0.6121870240894057

key: test_accuracy
value: [0.6875 0.8125 0.66666667 0.73333333 0.66666667 0.66666667
0.53333333 0.73333333 0.66666667 0.6 ]

mean value: 0.6766666666666666

key: train_accuracy
value: [0.76470588 0.75 0.78832117 0.75912409 0.77372263 0.76642336
0.81021898 0.86131387 0.86861314 0.75912409]

mean value: 0.7901567196221554

key: test_fscore
value: [0.61538462 0.76923077 0.54545455 0.66666667 0.44444444 0.54545455
0.46153846 0.71428571 0.66666667 0.57142857]

mean value: 0.6000555000555

key: train_fscore
value: [0.70909091 0.69090909 0.75213675 0.7079646 0.72072072 0.71428571
0.77966102 0.85271318 0.86363636 0.69724771]

mean value: 0.7488366054215206

key: test_precision
value: [0.8 1. 0.75 0.8 1. 0.75
0.6 0.83333333 0.71428571 0.66666667]

mean value: 0.7914285714285715

key: train_precision
value: [0.92857143 0.9047619 0.91666667 0.90909091 0.95238095 0.93023256
0.92 0.90163934 0.890625 0.92682927]

mean value: 0.9180798032166374

key: test_recall
value: [0.5 0.625 0.42857143 0.57142857 0.28571429 0.42857143
0.375 0.625 0.625 0.5 ]

mean value: 0.49642857142857144

key: train_recall
value: [0.57352941 0.55882353 0.63768116 0.57971014 0.57971014 0.57971014
0.67647059 0.80882353 0.83823529 0.55882353]

mean value: 0.639151747655584

key: test_roc_auc
value: [0.6875 0.8125 0.65178571 0.72321429 0.64285714 0.65178571
0.54464286 0.74107143 0.66964286 0.60714286]

mean value: 0.6732142857142858

key: train_roc_auc
value: [0.76470588 0.75 0.78942882 0.76044331 0.77514919 0.76779625
0.80924979 0.8609335 0.86839301 0.75767263]

mean value: 0.7903772378516624

key: test_jcc
value: [0.44444444 0.625 0.375 0.5 0.28571429 0.375
0.3 0.55555556 0.5 0.4 ]

mean value: 0.43607142857142855

key: train_jcc
value: [0.54929577 0.52777778 0.60273973 0.54794521 0.56338028 0.55555556
0.63888889 0.74324324 0.76 0.53521127]

mean value: 0.6024037720915977

MCC on Blind test: 0.28

Accuracy on Blind test: 0.78

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00991607 0.00885415 0.0101788 0.01003098 0.00886106 0.00952673
0.00979686 0.010741 0.01050067 0.01020956]

mean value: 0.009861588478088379

key: score_time
value: [0.00879622 0.0093503 0.00995898 0.00946927 0.00864244 0.00944114
0.00965357 0.01037049 0.00998425 0.00960541]

mean value: 0.009527206420898438

key: test_mcc
value: [ 0.40451992 0.67419986 0.46428571 0.75592895 0.6000992 0.46428571
-0.19642857 0.37796447 0.47245559 0.32732684]

mean value: 0.43446376808762277

key: train_mcc
value: [0.67911938 0.60616144 0.68986702 0.63862773 0.65701381 0.57996733
0.66746486 0.62437433 0.66581484 0.640228 ]

mean value: 0.6448638739267128

key: test_accuracy
value: [0.6875 0.8125 0.73333333 0.86666667 0.8 0.73333333
0.4 0.66666667 0.73333333 0.66666667]

mean value: 0.71

key: train_accuracy
value: [0.83823529 0.80147059 0.83941606 0.81751825 0.82481752 0.78832117
0.83211679 0.81021898 0.83211679 0.81751825]

mean value: 0.8201749677973379

key: test_fscore
value: [0.61538462 0.76923077 0.71428571 0.83333333 0.76923077 0.71428571
0.4 0.61538462 0.77777778 0.70588235]

mean value: 0.6914795661854485

key: train_fscore
value: [0.83076923 0.79069767 0.82539683 0.80916031 0.8125 0.77862595
0.82170543 0.796875 0.82442748 0.80314961]

mean value: 0.8093307503698478

key: test_precision
value: [0.8 1. 0.71428571 1. 0.83333333 0.71428571
0.42857143 0.8 0.7 0.66666667]

mean value: 0.7657142857142857

key: train_precision
value: [0.87096774 0.83606557 0.9122807 0.85483871 0.88135593 0.82258065
0.86885246 0.85 0.85714286 0.86440678]

mean value: 0.8618491400322729

key: test_recall
value: [0.5 0.625 0.71428571 0.71428571 0.71428571 0.71428571
0.375 0.5 0.875 0.75 ]

mean value: 0.6482142857142857

key: train_recall
value: [0.79411765 0.75 0.75362319 0.76811594 0.75362319 0.73913043
0.77941176 0.75 0.79411765 0.75 ]

mean value: 0.7632139812446718

key: test_roc_auc
value: [0.6875 0.8125 0.73214286 0.85714286 0.79464286 0.73214286
0.40178571 0.67857143 0.72321429 0.66071429]

mean value: 0.7080357142857143

key: train_roc_auc
value: [0.83823529 0.80147059 0.84004689 0.8178815 0.82534101 0.78868286
0.83173487 0.80978261 0.83184143 0.81702899]

mean value: 0.8202046035805627

key: test_jcc
value: [0.44444444 0.625 0.55555556 0.71428571 0.625 0.55555556
0.25 0.44444444 0.63636364 0.54545455]

mean value: 0.5396103896103897

key: train_jcc
value: [0.71052632 0.65384615 0.7027027 0.67948718 0.68421053 0.6375
0.69736842 0.66233766 0.7012987 0.67105263]

mean value: 0.6800330294409241

MCC on Blind test: 0.26

Accuracy on Blind test: 0.69

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00962687 0.0089643 0.00940514 0.00952768 0.00951171 0.00911784
0.00948524 0.00956416 0.01086783 0.00937128]

mean value: 0.00954420566558838

key: score_time
value: [0.01055789 0.01027226 0.01053548 0.010499 0.01059937 0.00998354
0.01044297 0.01039433 0.01533222 0.01476574]

mean value: 0.011338281631469726

key: test_mcc
value: [ 0.25 0.12598816 0.32732684 0.60714286 0.64465837 0.73214286
-0.07142857 0.33928571 0.21821789 0.75592895]

mean value: 0.3929263057641339

key: train_mcc
value: [0.62196632 0.60300638 0.6647466 0.59240339 0.60592498 0.5644673
0.66581484 0.60584099 0.61074523 0.640228 ]

mean value: 0.6175144024127495

key: test_accuracy
value: [0.625 0.5625 0.66666667 0.8 0.8 0.86666667
0.46666667 0.66666667 0.6 0.86666667]

mean value: 0.6920833333333334

key: train_accuracy
value: [0.80882353 0.80147059 0.83211679 0.79562044 0.80291971 0.7810219
0.83211679 0.80291971 0.80291971 0.81751825]

mean value: 0.8077447402318592

key: test_fscore
value: [0.625 0.58823529 0.61538462 0.8 0.72727273 0.85714286
0.5 0.66666667 0.57142857 0.88888889]

mean value: 0.6840019620901974

key: train_fscore
value: [0.796875 0.8 0.83687943 0.79104478 0.80291971 0.77272727
0.82442748 0.8 0.78740157 0.80314961]

mean value: 0.8015424851518379

key: test_precision
value: [0.625 0.55555556 0.66666667 0.75 1. 0.85714286
0.5 0.71428571 0.66666667 0.8 ]

mean value: 0.7135317460317461

key: train_precision
value: [0.85 0.80597015 0.81944444 0.81538462 0.80882353 0.80952381
0.85714286 0.80597015 0.84745763 0.86440678]

mean value: 0.8284123961194615

key: test_recall
value: [0.625 0.625 0.57142857 0.85714286 0.57142857 0.85714286
0.5 0.625 0.5 1. ]

mean value: 0.6732142857142857

key: train_recall
value: [0.75 0.79411765 0.85507246 0.76811594 0.79710145 0.73913043
0.79411765 0.79411765 0.73529412 0.75 ]

mean value: 0.7777067348678601

key: test_roc_auc
value: [0.625 0.5625 0.66071429 0.80357143 0.78571429 0.86607143
0.46428571 0.66964286 0.60714286 0.85714286]

mean value: 0.6901785714285714

key: train_roc_auc
value: [0.80882353 0.80147059 0.831948 0.79582268 0.80296249 0.78132992
0.83184143 0.80285592 0.80242967 0.81702899]

mean value: 0.8076513213981245

key: test_jcc
value: [0.45454545 0.41666667 0.44444444 0.66666667 0.57142857 0.75
0.33333333 0.5 0.4 0.8 ]

mean value: 0.5337085137085137

key: train_jcc
value: [0.66233766 0.66666667 0.7195122 0.65432099 0.67073171 0.62962963
0.7012987 0.66666667 0.64935065 0.67105263]

mean value: 0.6691567497622268

MCC on Blind test: 0.26

Accuracy on Blind test: 0.64

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.01161861 0.01145601 0.01143718 0.01146555 0.01147413 0.01156044
0.01610374 0.01574993 0.01282072 0.01037717]

mean value: 0.012406349182128906

key: score_time
value: [0.01001382 0.01000714 0.00987005 0.00989294 0.01008201 0.00988507
0.01483679 0.01519465 0.01058769 0.00996661]

mean value: 0.01103367805480957

key: test_mcc
value: [0.75 0.5 0.6000992 0.56407607 0.6000992 0.87287156
0.18898224 0.66143783 0.18898224 0.53452248]

mean value: 0.5461070816659918

key: train_mcc
value: [0.83905224 0.76503685 0.83951407 0.79590547 0.83951407 0.79599234
0.85434012 0.83951407 0.8251228 0.81031543]

mean value: 0.8204307458510867

key: test_accuracy
value: [0.875 0.75 0.8 0.73333333 0.8 0.93333333
0.6 0.8 0.6 0.73333333]

mean value: 0.7625

key: train_accuracy
value: [0.91911765 0.88235294 0.91970803 0.89781022 0.91970803 0.89781022
0.9270073 0.91970803 0.91240876 0.90510949]

mean value: 0.9100740661227995

key: test_fscore
value: [0.875 0.75 0.76923077 0.77777778 0.76923077 0.92307692
0.66666667 0.76923077 0.66666667 0.8 ]

mean value: 0.7766880341880341

key: train_fscore
value: [0.91729323 0.88405797 0.91970803 0.9 0.91970803 0.89705882
0.92537313 0.91970803 0.91044776 0.90510949]

mean value: 0.9098464499791336

key: test_precision
value: [0.875 0.75 0.83333333 0.63636364 0.83333333 1.
0.6 1. 0.6 0.66666667]

mean value: 0.7794696969696969

key: train_precision
value: [0.93846154 0.87142857 0.92647059 0.88732394 0.92647059 0.91044776
0.93939394 0.91304348 0.92424242 0.89855072]

mean value: 0.9135833557751614

key: test_recall
value: [0.875 0.75 0.71428571 1. 0.71428571 0.85714286
0.75 0.625 0.75 1. ]

mean value: 0.8035714285714286

key: train_recall
value: [0.89705882 0.89705882 0.91304348 0.91304348 0.91304348 0.88405797
0.91176471 0.92647059 0.89705882 0.91176471]

mean value: 0.9064364876385337

key: test_roc_auc
value: [0.875 0.75 0.79464286 0.75 0.79464286 0.92857143
0.58928571 0.8125 0.58928571 0.71428571]

mean value: 0.7598214285714285

key: train_roc_auc
value: [0.91911765 0.88235294 0.91975703 0.89769821 0.91975703 0.89791134
0.92689685 0.91975703 0.91229753 0.90515772]

mean value: 0.9100703324808184

key: test_jcc
value: [0.77777778 0.6 0.625 0.63636364 0.625 0.85714286
0.5 0.625 0.5 0.66666667]

mean value: 0.6412950937950938

key: train_jcc
value: [0.84722222 0.79220779 0.85135135 0.81818182 0.85135135 0.81333333
0.86111111 0.85135135 0.83561644 0.82666667]

mean value: 0.8348393436133162

MCC on Blind test: 0.48

Accuracy on Blind test: 0.76

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [0.70389104 0.67864847 1.09095931 0.99975729 0.98860121 0.84449649
0.65940809 0.66229033 0.58026719 0.63522124]

mean value: 0.7843540668487549

key: score_time
value: [0.01230264 0.01353049 0.01278973 0.01525879 0.0178535 0.0141089
0.01234412 0.01248693 0.01246476 0.01245332]

mean value: 0.013559317588806153

key: test_mcc
value: [0.37796447 0.75 0.32732684 0.46770717 0.6000992 0.32732684
0.47245559 0.66143783 0.46428571 0.53452248]

mean value: 0.4983126132351171

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.6875 0.875 0.66666667 0.66666667 0.8 0.66666667
0.73333333 0.8 0.73333333 0.73333333]

mean value: 0.73625

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.70588235 0.875 0.61538462 0.73684211 0.76923077 0.61538462
0.77777778 0.76923077 0.75 0.8 ]

mean value: 0.7414733005212881

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.66666667 0.875 0.66666667 0.58333333 0.83333333 0.66666667
0.7 1. 0.75 0.66666667]

mean value: 0.7408333333333333

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.75 0.875 0.57142857 1. 0.71428571 0.57142857
0.875 0.625 0.75 1. ]

mean value: 0.7732142857142857

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.6875 0.875 0.66071429 0.6875 0.79464286 0.66071429
0.72321429 0.8125 0.73214286 0.71428571]

mean value: 0.7348214285714285

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.54545455 0.77777778 0.44444444 0.58333333 0.625 0.44444444
0.63636364 0.625 0.6 0.66666667]

mean value: 0.5948484848484848

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.5

Accuracy on Blind test: 0.79

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01449776 0.01498055 0.01225662 0.01179147 0.01136565 0.01139188
|
|
0.0118866 0.01129723 0.01635385 0.01810718]
|
|
|
|
mean value: 0.013392877578735352
|
|
|
|
key: score_time
|
|
value: [0.01178002 0.01000428 0.00914884 0.00882721 0.00953579 0.00952053
|
|
0.00946212 0.00917983 0.01388979 0.0131166 ]
|
|
|
|
mean value: 0.010446500778198243
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.8819171 0.875 1. 1. 0.87287156
|
|
0.75592895 0.87287156 0.875 1. ]
|
|
|
|
mean value: 0.8908185840836074
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.9375 0.93333333 1. 1. 0.93333333
|
|
0.86666667 0.93333333 0.93333333 1. ]
|
|
|
|
mean value: 0.94125
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.93333333 0.93333333 1. 1. 0.92307692
|
|
0.88888889 0.94117647 0.93333333 1. ]
|
|
|
|
mean value: 0.9410285139696904
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.875 1. 1. 1.
|
|
0.8 0.88888889 1. 1. ]
|
|
|
|
mean value: 0.9563888888888888
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.875 1. 1. 1. 0.85714286
|
|
1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.9375 0.9375 1. 1. 0.92857143
|
|
0.85714286 0.92857143 0.9375 1. ]
|
|
|
|
mean value: 0.9401785714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.875 0.875 1. 1. 0.85714286
|
|
0.8 0.88888889 0.875 1. ]
|
|
|
|
mean value: 0.8921031746031746
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09935236 0.10409498 0.12266827 0.1061933 0.09845328 0.10499287
|
|
0.09967995 0.09435582 0.08926463 0.09026933]
|
|
|
|
mean value: 0.10093247890472412
|
|
|
|
key: score_time
|
|
value: [0.02710509 0.01923895 0.01924515 0.01959372 0.01917553 0.02634406
|
|
0.01896572 0.01882887 0.01784062 0.01771426]
|
|
|
|
mean value: 0.02040519714355469
|
|
|
|
key: test_mcc
|
|
value: [0.5 0.62994079 0.6000992 0.49099025 0.87287156 0.73214286
|
|
0.19642857 0.66143783 0.47245559 0.53452248]
|
|
|
|
mean value: 0.5690889131896603
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.8125 0.8 0.73333333 0.93333333 0.86666667
|
|
0.6 0.8 0.73333333 0.73333333]
|
|
|
|
mean value: 0.77625
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.82352941 0.76923077 0.75 0.92307692 0.85714286
|
|
0.625 0.76923077 0.77777778 0.8 ]
|
|
|
|
mean value: 0.7844988508223802
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.77777778 0.83333333 0.66666667 1. 0.85714286
|
|
0.625 1. 0.7 0.66666667]
|
|
|
|
mean value: 0.7876587301587301
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.875 0.71428571 0.85714286 0.85714286 0.85714286
|
|
0.625 0.625 0.875 1. ]
|
|
|
|
mean value: 0.8035714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.8125 0.79464286 0.74107143 0.92857143 0.86607143
|
|
0.59821429 0.8125 0.72321429 0.71428571]
|
|
|
|
mean value: 0.7741071428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.7 0.625 0.6 0.85714286 0.75
|
|
0.45454545 0.625 0.63636364 0.66666667]
|
|
|
|
mean value: 0.6514718614718614
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.73
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01000309 0.0091064 0.00996566 0.00957346 0.01001763 0.01011038
|
|
0.00918674 0.009022 0.00903392 0.00947905]
|
|
|
|
mean value: 0.009549832344055176
|
|
|
|
key: score_time
|
|
value: [0.00933886 0.00877929 0.00922394 0.00939322 0.00928879 0.00938463
|
|
0.00908303 0.00888348 0.00878763 0.0089674 ]
|
|
|
|
mean value: 0.009113025665283204
|
|
|
|
key: test_mcc
|
|
value: [ 0.13483997 0.62994079 0.19642857 0.46428571 0.34247476 0.46428571
|
|
0.05455447 -0.19642857 0.19642857 0.20044593]
|
|
|
|
mean value: 0.24872559245454634
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.5625 0.8125 0.6 0.73333333 0.66666667 0.73333333
|
|
0.53333333 0.4 0.6 0.6 ]
|
|
|
|
mean value: 0.6241666666666666
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.63157895 0.82352941 0.57142857 0.71428571 0.54545455 0.71428571
|
|
0.58823529 0.4 0.625 0.7 ]
|
|
|
|
mean value: 0.6313798198705319
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.54545455 0.77777778 0.57142857 0.71428571 0.75 0.71428571
|
|
0.55555556 0.42857143 0.625 0.58333333]
|
|
|
|
mean value: 0.626569264069264
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.875 0.57142857 0.71428571 0.42857143 0.71428571
|
|
0.625 0.375 0.625 0.875 ]
|
|
|
|
mean value: 0.6553571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.5625 0.8125 0.59821429 0.73214286 0.65178571 0.73214286
|
|
0.52678571 0.40178571 0.59821429 0.58035714]
|
|
|
|
mean value: 0.6196428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.46153846 0.7 0.4 0.55555556 0.375 0.55555556
|
|
0.41666667 0.25 0.45454545 0.53846154]
|
|
|
|
mean value: 0.4707323232323232
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.18459654 1.16294837 1.18245697 1.17819452 1.17918324 1.1352222
|
|
1.12796831 1.13954997 1.12182617 1.11643767]
|
|
|
|
mean value: 1.152838397026062
|
|
|
|
key: score_time
|
|
value: [0.09779596 0.09762621 0.09741259 0.09748626 0.08932686 0.0909512
|
|
0.09467626 0.09128833 0.15364361 0.09373999]
|
|
|
|
mean value: 0.10039472579956055
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.73214286 0.87287156 0.87287156 0.875
|
|
0.64465837 0.875 0.73214286 1. ]
|
|
|
|
mean value: 0.837928387663544
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.86666667 0.93333333 0.93333333 0.93333333
|
|
0.8 0.93333333 0.86666667 1. ]
|
|
|
|
mean value: 0.9141666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 1. 0.85714286 0.92307692 0.92307692 0.93333333
|
|
0.84210526 0.93333333 0.875 1. ]
|
|
|
|
mean value: 0.9144211490264121
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.85714286 1. 1. 0.875
|
|
0.72727273 1. 0.875 1. ]
|
|
|
|
mean value: 0.9334415584415584
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 0.85714286 0.85714286 0.85714286 1.
|
|
1. 0.875 0.875 1. ]
|
|
|
|
mean value: 0.9071428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.86607143 0.92857143 0.92857143 0.9375
|
|
0.78571429 0.9375 0.86607143 1. ]
|
|
|
|
mean value: 0.9125
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 1. 0.75 0.85714286 0.85714286 0.875
|
|
0.72727273 0.875 0.77777778 1. ]
|
|
|
|
mean value: 0.8469336219336219
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.79976606 0.85018516 0.8501091 0.89193416 0.93633461 0.90565896
|
|
0.87436795 0.92361069 0.89648938 0.87731433]
|
|
|
|
mean value: 0.8805770397186279
|
|
|
|
key: score_time
|
|
value: [0.23642707 0.20861435 0.2369709 0.24829936 0.22590995 0.23644233
|
|
0.15728378 0.22497702 0.20615268 0.19786143]
|
|
|
|
mean value: 0.2178938865661621
|
|
|
|
key: test_mcc
|
|
value: [0.8819171 0.8819171 0.73214286 0.87287156 0.87287156 1.
|
|
0.32732684 0.66143783 0.875 0.87287156]
|
|
|
|
mean value: 0.7978356410471296
|
|
|
|
key: train_mcc
|
|
value: [0.97058824 0.95598573 0.95630861 0.95713391 0.97080136 0.97080136
|
|
0.97120941 0.97080136 0.95629932 0.95629932]
|
|
|
|
mean value: 0.9636228629898831
|
|
|
|
key: test_accuracy
|
|
value: [0.9375 0.9375 0.86666667 0.93333333 0.93333333 1.
|
|
0.66666667 0.8 0.93333333 0.93333333]
|
|
|
|
mean value: 0.8941666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.98529412 0.97794118 0.97810219 0.97810219 0.98540146 0.98540146
|
|
0.98540146 0.98540146 0.97810219 0.97810219]
|
|
|
|
mean value: 0.9817249892657793
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.93333333 0.85714286 0.92307692 0.92307692 1.
|
|
0.70588235 0.76923077 0.93333333 0.94117647]
|
|
|
|
mean value: 0.8919586296056884
|
|
|
|
key: train_fscore
|
|
value: [0.98529412 0.97777778 0.97810219 0.97777778 0.98550725 0.98550725
|
|
0.98507463 0.98529412 0.97777778 0.97777778]
|
|
|
|
mean value: 0.9815890655805546
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.85714286 1. 1. 1.
|
|
0.66666667 1. 1. 0.88888889]
|
|
|
|
mean value: 0.9412698412698413
|
|
|
|
key: train_precision
|
|
value: [0.98529412 0.98507463 0.98529412 1. 0.98550725 0.98550725
|
|
1. 0.98529412 0.98507463 0.98507463]
|
|
|
|
mean value: 0.9882120726291814
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.85714286 0.85714286 0.85714286 1.
|
|
0.75 0.625 0.875 1. ]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.98529412 0.97058824 0.97101449 0.95652174 0.98550725 0.98550725
|
|
0.97058824 0.98529412 0.97058824 0.97058824]
|
|
|
|
mean value: 0.9751491901108269
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.9375 0.86607143 0.92857143 0.92857143 1.
|
|
0.66071429 0.8125 0.9375 0.92857143]
|
|
|
|
mean value: 0.89375
|
|
|
|
key: train_roc_auc
|
|
value: [0.98529412 0.97794118 0.97815431 0.97826087 0.98540068 0.98540068
|
|
0.98529412 0.98540068 0.97804774 0.97804774]
|
|
|
|
mean value: 0.9817242114237
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.875 0.75 0.85714286 0.85714286 1.
|
|
0.54545455 0.625 0.875 0.88888889]
|
|
|
|
mean value: 0.8148629148629148
|
|
|
|
key: train_jcc
|
|
value: [0.97101449 0.95652174 0.95714286 0.95652174 0.97142857 0.97142857
|
|
0.97058824 0.97101449 0.95652174 0.95652174]
|
|
|
|
mean value: 0.9638704177323103
|
|
|
|
MCC on Blind test: 0.85
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.00971198 0.0104022 0.00985646 0.00999689 0.01531601 0.0123229
0.00914288 0.00894403 0.00899482 0.01425886]

mean value: 0.010894703865051269

key: score_time
value: [0.00989151 0.00964308 0.00926447 0.01156521 0.01512861 0.01090312
0.00868583 0.00870919 0.0092032 0.01466846]

mean value: 0.010766267776489258

key: test_mcc
value: [ 0.40451992 0.67419986 0.46428571 0.75592895 0.6000992 0.46428571
-0.19642857 0.37796447 0.47245559 0.32732684]

mean value: 0.43446376808762277

key: train_mcc
value: [0.67911938 0.60616144 0.68986702 0.63862773 0.65701381 0.57996733
0.66746486 0.62437433 0.66581484 0.640228 ]

mean value: 0.6448638739267128

key: test_accuracy
value: [0.6875 0.8125 0.73333333 0.86666667 0.8 0.73333333
0.4 0.66666667 0.73333333 0.66666667]

mean value: 0.71

key: train_accuracy
value: [0.83823529 0.80147059 0.83941606 0.81751825 0.82481752 0.78832117
0.83211679 0.81021898 0.83211679 0.81751825]

mean value: 0.8201749677973379

key: test_fscore
value: [0.61538462 0.76923077 0.71428571 0.83333333 0.76923077 0.71428571
0.4 0.61538462 0.77777778 0.70588235]

mean value: 0.6914795661854485

key: train_fscore
value: [0.83076923 0.79069767 0.82539683 0.80916031 0.8125 0.77862595
0.82170543 0.796875 0.82442748 0.80314961]

mean value: 0.8093307503698478

key: test_precision
value: [0.8 1. 0.71428571 1. 0.83333333 0.71428571
0.42857143 0.8 0.7 0.66666667]

mean value: 0.7657142857142857

key: train_precision
value: [0.87096774 0.83606557 0.9122807 0.85483871 0.88135593 0.82258065
0.86885246 0.85 0.85714286 0.86440678]

mean value: 0.8618491400322729

key: test_recall
value: [0.5 0.625 0.71428571 0.71428571 0.71428571 0.71428571
0.375 0.5 0.875 0.75 ]

mean value: 0.6482142857142857

key: train_recall
value: [0.79411765 0.75 0.75362319 0.76811594 0.75362319 0.73913043
0.77941176 0.75 0.79411765 0.75 ]

mean value: 0.7632139812446718

key: test_roc_auc
value: [0.6875 0.8125 0.73214286 0.85714286 0.79464286 0.73214286
0.40178571 0.67857143 0.72321429 0.66071429]

mean value: 0.7080357142857143

key: train_roc_auc
value: [0.83823529 0.80147059 0.84004689 0.8178815 0.82534101 0.78868286
0.83173487 0.80978261 0.83184143 0.81702899]

mean value: 0.8202046035805627

key: test_jcc
value: [0.44444444 0.625 0.55555556 0.71428571 0.625 0.55555556
0.25 0.44444444 0.63636364 0.54545455]

mean value: 0.5396103896103897

key: train_jcc
value: [0.71052632 0.65384615 0.7027027 0.67948718 0.68421053 0.6375
0.69736842 0.66233766 0.7012987 0.67105263]

mean value: 0.6800330294409241

MCC on Blind test: 0.26

Accuracy on Blind test: 0.69
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.14137769 0.04036045 0.04341197 0.04366136 0.0674777 0.04775929
0.04809213 0.04925728 0.04873991 0.04506922]

mean value: 0.057520699501037595

key: score_time
value: [0.01150417 0.01063967 0.01087403 0.01018643 0.01102757 0.0106461
0.01055002 0.01029372 0.01260114 0.01114511]

mean value: 0.01094679832458496

key: test_mcc
value: [0.77459667 1. 1. 0.87287156 1. 0.875
0.87287156 1. 0.875 1. ]

mean value: 0.9270339791129423

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.875 1. 1. 0.93333333 1. 0.93333333
0.93333333 1. 0.93333333 1. ]

mean value: 0.9608333333333333

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.85714286 1. 1. 0.92307692 1. 0.93333333
0.94117647 1. 0.93333333 1. ]

mean value: 0.9588062917474682

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 1. 1. 1. 0.875
0.88888889 1. 1. 1. ]

mean value: 0.9763888888888889

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.75 1. 1. 0.85714286 1. 1.
1. 1. 0.875 1. ]

mean value: 0.9482142857142857

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.875 1. 1. 0.92857143 1. 0.9375
0.92857143 1. 0.9375 1. ]

mean value: 0.9607142857142857

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.75 1. 1. 0.85714286 1. 0.875
0.88888889 1. 0.875 1. ]

mean value: 0.9246031746031746

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.88

Accuracy on Blind test: 0.96
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.0207808 0.02300382 0.0433135 0.05539727 0.05441689 0.05075097
0.05279303 0.04780746 0.06620932 0.02348566]

mean value: 0.04379587173461914

key: score_time
value: [0.01240396 0.02174282 0.02213216 0.02098989 0.0220952 0.01775432
0.01941085 0.02169156 0.01264215 0.02482295]

mean value: 0.019568586349487306

key: test_mcc
value: [-0.12598816 0.12598816 0.47245559 0.32732684 0.04029115 0.46428571
-0.07142857 0.26189246 0.6000992 0.32732684]

mean value: 0.24222492144851507

key: train_mcc
value: [1. 0.98540068 1. 0.97080136 0.97080136 0.98550418
1. 0.98550725 1. 0.98550418]

mean value: 0.9883519009251238

key: test_accuracy
value: [0.4375 0.5625 0.73333333 0.66666667 0.53333333 0.73333333
0.46666667 0.6 0.8 0.66666667]

mean value: 0.62

key: train_accuracy
value: [1. 0.99264706 1. 0.98540146 0.98540146 0.99270073
1. 0.99270073 1. 0.99270073]

mean value: 0.9941552168312581

key: test_fscore
value: [0.4 0.58823529 0.66666667 0.61538462 0.36363636 0.71428571
0.5 0.5 0.82352941 0.70588235]

mean value: 0.587762041879689

key: train_fscore
value: [1. 0.99270073 1. 0.98550725 0.98550725 0.99280576
1. 0.99270073 1. 0.99259259]

mean value: 0.9941814300595915

key: test_precision
value: [0.42857143 0.55555556 0.8 0.66666667 0.5 0.71428571
0.5 0.75 0.77777778 0.66666667]

mean value: 0.6359523809523809

key: train_precision
value: [1. 0.98550725 1. 0.98550725 0.98550725 0.98571429
1. 0.98550725 1. 1. ]

mean value: 0.9927743271221532

key: test_recall
value: [0.375 0.625 0.57142857 0.57142857 0.28571429 0.71428571
0.5 0.375 0.875 0.75 ]

mean value: 0.5642857142857143

key: train_recall
value: [1. 1. 1. 0.98550725 0.98550725 1.
1. 1. 1. 0.98529412]

mean value: 0.9956308610400683

key: test_roc_auc
value: [0.4375 0.5625 0.72321429 0.66071429 0.51785714 0.73214286
0.46428571 0.61607143 0.79464286 0.66071429]

mean value: 0.6169642857142857

key: train_roc_auc
value: [1. 0.99264706 1. 0.98540068 0.98540068 0.99264706
1. 0.99275362 1. 0.99264706]

mean value: 0.9941496163682865

key: test_jcc
value: [0.25 0.41666667 0.5 0.44444444 0.22222222 0.55555556
0.33333333 0.33333333 0.7 0.54545455]

mean value: 0.4301010101010101

key: train_jcc
value: [1. 0.98550725 1. 0.97142857 0.97142857 0.98571429
1. 0.98550725 1. 0.98529412]

mean value: 0.988488003897211

MCC on Blind test: 0.61

Accuracy on Blind test: 0.86
Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02475095 0.00916529 0.00877476 0.00865769 0.01192904 0.00877118
0.00866795 0.00887179 0.01024485 0.00899529]

mean value: 0.010882878303527832

key: score_time
value: [0.00913358 0.00896764 0.00864172 0.00901842 0.00979638 0.00859523
0.00856853 0.00883698 0.00979352 0.01023078]

mean value: 0.00915827751159668

key: test_mcc
value: [0.67419986 0.75 0.73214286 0.21821789 0.87287156 0.75592895
0.19642857 0.66143783 0.32732684 0.34247476]

mean value: 0.553102911106401

key: train_mcc
value: [0.67676337 0.63406934 0.678815 0.63512361 0.73758262 0.66432225
0.79590547 0.69352089 0.73721228 0.62060153]

mean value: 0.6873916366987387

key: test_accuracy
value: [0.8125 0.875 0.86666667 0.6 0.93333333 0.86666667
0.6 0.8 0.66666667 0.66666667]

mean value: 0.76875

key: train_accuracy
value: [0.83823529 0.81617647 0.83941606 0.81751825 0.86861314 0.83211679
0.89781022 0.84671533 0.86861314 0.81021898]

mean value: 0.8435433662516101

key: test_fscore
value: [0.76923077 0.875 0.85714286 0.625 0.92307692 0.83333333
0.625 0.76923077 0.70588235 0.73684211]

mean value: 0.7719739110218986

key: train_fscore
value: [0.84057971 0.82269504 0.84057971 0.81751825 0.86764706 0.83211679
0.89552239 0.84671533 0.86764706 0.80597015]

mean value: 0.8436991475674843

key: test_precision
value: [1. 0.875 0.85714286 0.55555556 1. 1.
0.625 1. 0.66666667 0.63636364]

mean value: 0.8215728715728716

key: train_precision
value: [0.82857143 0.79452055 0.84057971 0.82352941 0.88059701 0.83823529
0.90909091 0.84057971 0.86764706 0.81818182]

mean value: 0.8441532903710471

key: test_recall
value: [0.625 0.875 0.85714286 0.71428571 0.85714286 0.71428571
0.625 0.625 0.75 0.875 ]

mean value: 0.7517857142857143

key: train_recall
value: [0.85294118 0.85294118 0.84057971 0.8115942 0.85507246 0.82608696
0.88235294 0.85294118 0.86764706 0.79411765]

mean value: 0.8436274509803922

key: test_roc_auc
value: [0.8125 0.875 0.86607143 0.60714286 0.92857143 0.85714286
0.59821429 0.8125 0.66071429 0.65178571]

mean value: 0.7669642857142858

key: train_roc_auc
value: [0.83823529 0.81617647 0.8394075 0.81756181 0.8687127 0.83216113
0.89769821 0.84676044 0.86860614 0.8101023 ]

mean value: 0.8435421994884911

key: test_jcc
value: [0.625 0.77777778 0.75 0.45454545 0.85714286 0.71428571
0.45454545 0.625 0.54545455 0.58333333]

mean value: 0.6387085137085137

key: train_jcc
value: [0.725 0.69879518 0.725 0.69135802 0.76623377 0.7125
0.81081081 0.73417722 0.76623377 0.675 ]

mean value: 0.7305108763882466

MCC on Blind test: 0.56

Accuracy on Blind test: 0.81
Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.0145216 0.01501608 0.01437402 0.01481843 0.01543903 0.01425052
0.01434255 0.01563168 0.014395 0.01469946]

mean value: 0.014748835563659668

key: score_time
value: [0.01158118 0.01175547 0.01154518 0.01160622 0.01153302 0.01158404
0.01175642 0.01160836 0.01161551 0.01179576]

mean value: 0.011638116836547852

key: test_mcc
value: [0.8819171 0.67419986 0.75592895 0.73214286 0.73214286 0.53452248
0.6000992 0.56407607 0.60714286 0.53452248]

mean value: 0.6616694724214908

key: train_mcc
value: [0.94280904 0.8623165 0.8130258 0.95713391 0.95629932 0.88938138
0.92944673 0.81250852 0.88920184 0.9158731 ]

mean value: 0.8967996137276029

key: test_accuracy
value: [0.9375 0.8125 0.86666667 0.86666667 0.86666667 0.73333333
0.8 0.73333333 0.8 0.73333333]

mean value: 0.815

key: train_accuracy
value: [0.97058824 0.92647059 0.89781022 0.97810219 0.97810219 0.94160584
0.96350365 0.89781022 0.94160584 0.95620438]

mean value: 0.9451803349076857

key: test_fscore
value: [0.93333333 0.76923077 0.83333333 0.85714286 0.85714286 0.6
0.82352941 0.66666667 0.8 0.8 ]

mean value: 0.7940379228614522

key: train_fscore
value: [0.96969697 0.92063492 0.88709677 0.97777778 0.97841727 0.93846154
0.96183206 0.8852459 0.9375 0.95384615]

mean value: 0.9410509363506006

key: test_precision
value: [1. 1. 1. 0.85714286 0.85714286 1.
0.77777778 1. 0.85714286 0.66666667]

mean value: 0.9015873015873016

key: train_precision
value: [1. 1. 1. 1. 0.97142857 1.
1. 1. 1. 1. ]

mean value: 0.9971428571428571

key: test_recall
value: [0.875 0.625 0.71428571 0.85714286 0.85714286 0.42857143
0.875 0.5 0.75 1. ]

mean value: 0.7482142857142857

key: train_recall
value: [0.94117647 0.85294118 0.79710145 0.95652174 0.98550725 0.88405797
0.92647059 0.79411765 0.88235294 0.91176471]

mean value: 0.8932011935208866

key: test_roc_auc
value: [0.9375 0.8125 0.85714286 0.86607143 0.86607143 0.71428571
0.79464286 0.75 0.80357143 0.71428571]

mean value: 0.8116071428571429

key: train_roc_auc
value: [0.97058824 0.92647059 0.89855072 0.97826087 0.97804774 0.94202899
0.96323529 0.89705882 0.94117647 0.95588235]

mean value: 0.9451300085251492

key: test_jcc
value: [0.875 0.625 0.71428571 0.75 0.75 0.42857143
0.7 0.5 0.66666667 0.66666667]

mean value: 0.6676190476190476

key: train_jcc
value: [0.94117647 0.85294118 0.79710145 0.95652174 0.95774648 0.88405797
0.92647059 0.79411765 0.88235294 0.91176471]

mean value: 0.8904251167705294

MCC on Blind test: 0.81

Accuracy on Blind test: 0.93
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.01324773 0.01418781 0.01375175 0.01332355 0.01351047 0.01325345
0.01305151 0.01286387 0.01371384 0.01300216]

mean value: 0.013390612602233887

key: score_time
value: [0.0118475 0.01155829 0.01162076 0.01165366 0.01165652 0.01157427
0.01155686 0.01157951 0.01150799 0.01157808]

mean value: 0.0116133451461792

key: test_mcc
value: [0.37796447 0.48038446 0.75592895 0.64465837 0.49099025 0.64465837
0.6000992 0.76376262 0.73214286 0.64465837]

mean value: 0.6135247918252648

key: train_mcc
value: [0.70321085 0.6799747 0.92951942 0.78854812 0.81433714 0.82543222
0.8978896 0.89869927 0.88938138 0.88654289]

mean value: 0.8313535585121352

key: test_accuracy
value: [0.625 0.6875 0.86666667 0.8 0.73333333 0.8
0.8 0.86666667 0.86666667 0.8 ]

mean value: 0.7845833333333334

key: train_accuracy
value: [0.83088235 0.81617647 0.96350365 0.88321168 0.90510949 0.90510949
0.94890511 0.94890511 0.94160584 0.94160584]

mean value: 0.9085015027908974

key: test_fscore
value: [0.4 0.54545455 0.83333333 0.72727273 0.75 0.72727273
0.82352941 0.85714286 0.875 0.84210526]

mean value: 0.7381110865398791

key: train_fscore
value: [0.79646018 0.77477477 0.96240602 0.86885246 0.91034483 0.896
0.94814815 0.94964029 0.94444444 0.93846154]

mean value: 0.8989532672230035

key: test_precision
value: [1. 1. 1. 1. 0.66666667 1.
0.77777778 1. 0.875 0.72727273]

mean value: 0.9046717171717171

key: train_precision
value: [1. 1. 1. 1. 0.86842105 1.
0.95522388 0.92957746 0.89473684 0.98387097]

mean value: 0.9631830207864525

key: test_recall
value: [0.25 0.375 0.71428571 0.57142857 0.85714286 0.57142857
0.875 0.75 0.875 1. ]

mean value: 0.6839285714285714

key: train_recall
value: [0.66176471 0.63235294 0.92753623 0.76811594 0.95652174 0.8115942
0.94117647 0.97058824 1. 0.89705882]

mean value: 0.8566709292412618

key: test_roc_auc
value: [0.625 0.6875 0.85714286 0.78571429 0.74107143 0.78571429
0.79464286 0.875 0.86607143 0.78571429]

mean value: 0.7803571428571429

key: train_roc_auc
value: [0.83088235 0.81617647 0.96376812 0.88405797 0.90473146 0.9057971
0.9488491 0.94906223 0.94202899 0.94128303]

mean value: 0.9086636828644501

key: test_jcc
value: [0.25 0.375 0.71428571 0.57142857 0.6 0.57142857
0.7 0.75 0.77777778 0.72727273]

mean value: 0.6037193362193363

key: train_jcc
value: [0.66176471 0.63235294 0.92753623 0.76811594 0.83544304 0.8115942
0.90140845 0.90410959 0.89473684 0.88405797]

mean value: 0.8221119914710179

MCC on Blind test: 0.57

Accuracy on Blind test: 0.83
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11366677 0.0940814 0.10225201 0.10271406 0.09806108 0.09513283
|
|
0.09545851 0.09452772 0.09964037 0.09940076]
|
|
|
|
mean value: 0.09949355125427246
|
|
|
|
key: score_time
|
|
value: [0.01469493 0.01464295 0.02262974 0.01577759 0.01471615 0.0148375
|
|
0.0147779 0.01488066 0.01598573 0.01483297]
|
|
|
|
mean value: 0.01577761173248291
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.76376262 0.87287156 1. 1.
|
|
0.87287156 1. 0.875 1. ]
|
|
|
|
mean value: 0.9159102406955395
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.86666667 0.93333333 1. 1.
|
|
0.93333333 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9541666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 1. 0.875 0.92307692 1. 1.
|
|
0.94117647 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9529729584141349
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.77777778 1. 1. 1.
|
|
0.88888889 1. 1. 1. ]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 1. 0.85714286 1. 1.
|
|
1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.875 0.92857143 1. 1.
|
|
0.92857143 1. 0.9375 1. ]
|
|
|
|
mean value: 0.9544642857142858
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 1. 0.77777778 0.85714286 1. 1.
|
|
0.88888889 1. 0.875 1. ]
|
|
|
|
mean value: 0.9148809523809524
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0341208 0.03071618 0.06870103 0.02894068 0.03026152 0.03227735
|
|
0.03287911 0.0358839 0.04638481 0.04289818]
|
|
|
|
mean value: 0.038306355476379395
|
|
|
|
key: score_time
|
|
value: [0.02494836 0.02263951 0.02276754 0.01839662 0.01910567 0.02346301
|
|
0.02387643 0.02339864 0.03518057 0.03576827]
|
|
|
|
mean value: 0.024954462051391603
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 1. 0.87287156 1. 1.
|
|
0.87287156 1. 0.875 1. ]
|
|
|
|
mean value: 0.9395339791129422
|
|
|
|
key: train_mcc
|
|
value: [0.98540068 0.98540068 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.97120941 0.98550418 1. 0.98550418]
|
|
|
|
mean value: 0.9855048108412058
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 1. 0.93333333 1. 1.
|
|
0.93333333 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9675
|
|
|
|
key: train_accuracy
|
|
value: [0.99264706 0.99264706 0.99270073 0.99270073 0.99270073 0.99270073
|
|
0.98540146 0.99270073 1. 0.99270073]
|
|
|
|
mean value: 0.9926899957063118
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 1. 1. 0.92307692 1. 1.
|
|
0.94117647 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9654729584141348
|
|
|
|
key: train_fscore
|
|
value: [0.99259259 0.99259259 0.99270073 0.99270073 0.99270073 0.99270073
|
|
0.98507463 0.99259259 1. 0.99259259]
|
|
|
|
mean value: 0.9926247916944072
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.88888889 1. 1. 1. ]
|
|
|
|
mean value: 0.9888888888888889
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 1. 0.85714286 1. 1.
|
|
1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_recall
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.97058824 0.98529412 1. 0.98529412]
|
|
|
|
mean value: 0.98537936913896
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 1. 0.92857143 1. 1.
|
|
0.92857143 1. 0.9375 1. ]
|
|
|
|
mean value: 0.9669642857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.99264706 0.99264706 0.99275362 0.99275362 0.99275362 0.99275362
|
|
0.98529412 0.99264706 1. 0.99264706]
|
|
|
|
mean value: 0.9926896845694799
|
|
|
|
key: test_jcc
|
|
value: [0.75 1. 1. 0.85714286 1. 1.
|
|
0.88888889 1. 0.875 1. ]
|
|
|
|
mean value: 0.9371031746031746
|
|
|
|
key: train_jcc
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.97058824 0.98529412 1. 0.98529412]
|
|
|
|
mean value: 0.98537936913896
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03243661 0.04290581 0.07457685 0.05478549 0.05016804 0.04229784
|
|
0.05081439 0.05759549 0.05455852 0.05034065]
|
|
|
|
mean value: 0.05104796886444092
|
|
|
|
key: score_time
|
|
value: [0.01870847 0.02562094 0.02191496 0.0251255 0.03725863 0.0223453
|
|
0.02001977 0.02464914 0.01784778 0.02232647]
|
|
|
|
mean value: 0.023581695556640626
|
|
|
|
key: test_mcc
|
|
value: [0.51639778 0.51639778 0.47245559 0.21821789 0.64465837 0.6000992
|
|
0.07142857 0.46770717 0.32732684 0.60714286]
|
|
|
|
mean value: 0.4441832047127614
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.75 0.73333333 0.6 0.8 0.8
|
|
0.53333333 0.66666667 0.66666667 0.8 ]
|
|
|
|
mean value: 0.71
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.71428571 0.66666667 0.625 0.72727273 0.76923077
|
|
0.53333333 0.54545455 0.70588235 0.8 ]
|
|
|
|
mean value: 0.6801411823470647
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.83333333 0.8 0.55555556 1. 0.83333333
|
|
0.57142857 1. 0.66666667 0.85714286]
|
|
|
|
mean value: 0.795079365079365
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.625 0.625 0.57142857 0.71428571 0.57142857 0.71428571
|
|
0.5 0.375 0.75 0.75 ]
|
|
|
|
mean value: 0.6196428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.75 0.72321429 0.60714286 0.78571429 0.79464286
|
|
0.53571429 0.6875 0.66071429 0.80357143]
|
|
|
|
mean value: 0.7098214285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.55555556 0.5 0.45454545 0.57142857 0.625
|
|
0.36363636 0.375 0.54545455 0.66666667]
|
|
|
|
mean value: 0.5212842712842712
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.26349664 0.30767131 0.26317835 0.27116275 0.31355691 0.28043199
|
|
0.25030661 0.25096774 0.2339437 0.26458788]
|
|
|
|
mean value: 0.26993038654327395
|
|
|
|
key: score_time
|
|
value: [0.01109076 0.01076746 0.00925684 0.01056147 0.01435041 0.01452732
|
|
0.009161 0.00908542 0.00923038 0.00994611]
|
|
|
|
mean value: 0.010797715187072754
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 1. 0.87287156 1. 0.87287156
|
|
0.75592895 1. 0.875 1. ]
|
|
|
|
mean value: 0.9151268737147877
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 1. 0.93333333 1. 0.93333333
|
|
0.86666667 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9541666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 1. 1. 0.92307692 1. 0.92307692
|
|
0.88888889 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9525518925518925
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1. 0.8 1. 1. 1. ]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 1. 0.85714286 1. 0.85714286
|
|
1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.9339285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 1. 0.92857143 1. 0.92857143
|
|
0.85714286 1. 0.9375 1. ]
|
|
|
|
mean value: 0.9526785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 1. 1. 0.85714286 1. 0.85714286
|
|
0.8 1. 0.875 1. ]
|
|
|
|
mean value: 0.9139285714285714
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01642203 0.01763225 0.02696109 0.01796865 0.01799726 0.01788306
|
|
0.01834369 0.02878118 0.01888704 0.02675176]
|
|
|
|
mean value: 0.02076280117034912
|
|
|
|
key: score_time
|
|
value: [0.01238465 0.01226735 0.01235175 0.01354599 0.01394749 0.01234221
|
|
0.01371408 0.01333451 0.01352239 0.01251984]
|
|
|
|
mean value: 0.01299302577972412
|
|
|
|
key: test_mcc
|
|
value: [ 0.12598816 0. -0.05455447 -0.13363062 -0.64465837 0.04029115
|
|
0.49099025 -0.19642857 0.19642857 0.33928571]
|
|
|
|
mean value: 0.016371180845219407
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.5625 0.5 0.46666667 0.46666667 0.2 0.53333333
|
|
0.73333333 0.4 0.6 0.66666667]
|
|
|
|
mean value: 0.5129166666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.58823529 0.5 0.5 0.2 0.33333333 0.36363636
|
|
0.71428571 0.4 0.625 0.66666667]
|
|
|
|
mean value: 0.4891157372039725
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.5 0.44444444 0.33333333 0.27272727 0.5
|
|
0.83333333 0.42857143 0.625 0.71428571]
|
|
|
|
mean value: 0.5207251082251082
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.625 0.5 0.57142857 0.14285714 0.42857143 0.28571429
|
|
0.625 0.375 0.625 0.625 ]
|
|
|
|
mean value: 0.48035714285714287
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.5625 0.5 0.47321429 0.44642857 0.21428571 0.51785714
|
|
0.74107143 0.40178571 0.59821429 0.66964286]
|
|
|
|
mean value: 0.5125
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.41666667 0.33333333 0.33333333 0.11111111 0.2 0.22222222
|
|
0.55555556 0.25 0.45454545 0.5 ]
|
|
|
|
mean value: 0.3376767676767677
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02977848 0.03747535 0.04021692 0.03384948 0.03854942 0.03706765
|
|
0.03542995 0.03389931 0.03391671 0.03379488]
|
|
|
|
mean value: 0.035397815704345706
|
|
|
|
key: score_time
|
|
value: [0.02066422 0.02376986 0.02077508 0.02034974 0.02166724 0.02019167
|
|
0.02067709 0.02275515 0.02010059 0.02273393]
|
|
|
|
mean value: 0.02136845588684082
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 1. 0.87287156 0.76376262 0.87287156 0.87287156
|
|
0.64465837 0.66143783 0.73214286 0.53452248]
|
|
|
|
mean value: 0.7585079626960751
|
|
|
|
key: train_mcc
|
|
value: [0.97058824 0.97058824 0.97080136 0.97080136 0.97080136 0.97080136
|
|
0.98550418 0.97080136 0.97080136 0.97080136]
|
|
|
|
mean value: 0.9722290198043756
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 1. 0.93333333 0.86666667 0.93333333 0.93333333
|
|
0.8 0.8 0.86666667 0.73333333]
|
|
|
|
mean value: 0.8679166666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.98529412 0.98529412 0.98540146 0.98540146 0.98540146 0.98540146
|
|
0.99270073 0.98540146 0.98540146 0.98540146]
|
|
|
|
mean value: 0.9861099184199227
|
|
|
|
key: test_fscore
|
|
value: [0.8 1. 0.92307692 0.875 0.92307692 0.92307692
|
|
0.84210526 0.76923077 0.875 0.8 ]
|
|
|
|
mean value: 0.8730566801619433
|
|
|
|
key: train_fscore
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.99259259 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9861092166335134
|
|
|
|
key: test_precision
|
|
value: [0.85714286 1. 1. 0.77777778 1. 1.
|
|
0.72727273 1. 0.875 0.66666667]
|
|
|
|
mean value: 0.8903860028860029
|
|
|
|
key: train_precision
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
1. 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9868499573742541
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 0.85714286 1. 0.85714286 0.85714286
|
|
1. 0.625 0.875 1. ]
|
|
|
|
mean value: 0.8821428571428571
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:168: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:171: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_recall
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.98537936913896
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 1. 0.92857143 0.875 0.92857143 0.92857143
|
|
0.78571429 0.8125 0.86607143 0.71428571]
|
|
|
|
mean value: 0.8651785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.98529412 0.98529412 0.98540068 0.98540068 0.98540068 0.98540068
|
|
0.99264706 0.98540068 0.98540068 0.98540068]
|
|
|
|
mean value: 0.9861040068201194
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 1. 0.85714286 0.77777778 0.85714286 0.85714286
|
|
0.72727273 0.625 0.77777778 0.66666667]
|
|
|
|
mean value: 0.7812590187590187
|
|
|
|
key: train_jcc
|
|
value: [0.97101449 0.97101449 0.97142857 0.97142857 0.97142857 0.97142857
|
|
0.98529412 0.97101449 0.97101449 0.97101449]
|
|
|
|
mean value: 0.9726080867129461
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.20923996 0.30740142 0.23965693 0.30709243 0.33208108 0.24986792
|
|
0.23055601 0.22767615 0.24073982 0.21366882]
|
|
|
|
mean value: 0.2557980537414551
|
|
|
|
key: score_time
|
|
value: [0.02369094 0.02021146 0.02024269 0.02145529 0.02396369 0.02035141
|
|
0.02180147 0.02421188 0.0203321 0.02221036]
|
|
|
|
mean value: 0.021847128868103027
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 1. 0.87287156 0.76376262 0.87287156 0.87287156
|
|
0.64465837 0.66143783 0.73214286 0.53452248]
|
|
|
|
mean value: 0.7585079626960751
|
|
|
|
key: train_mcc
|
|
value: [0.97058824 0.97058824 0.97080136 0.97080136 0.97080136 0.97080136
|
|
0.98550418 0.97080136 0.97080136 0.97080136]
|
|
|
|
mean value: 0.9722290198043756
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 1. 0.93333333 0.86666667 0.93333333 0.93333333
|
|
0.8 0.8 0.86666667 0.73333333]
|
|
|
|
mean value: 0.8679166666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.98529412 0.98529412 0.98540146 0.98540146 0.98540146 0.98540146
|
|
0.99270073 0.98540146 0.98540146 0.98540146]
|
|
|
|
mean value: 0.9861099184199227
|
|
|
|
key: test_fscore
|
|
value: [0.8 1. 0.92307692 0.875 0.92307692 0.92307692
|
|
0.84210526 0.76923077 0.875 0.8 ]
|
|
|
|
mean value: 0.8730566801619433
|
|
|
|
key: train_fscore
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.99259259 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9861092166335134
|
|
|
|
key: test_precision
|
|
value: [0.85714286 1. 1. 0.77777778 1. 1.
|
|
0.72727273 1. 0.875 0.66666667]
|
|
|
|
mean value: 0.8903860028860029
|
|
|
|
key: train_precision
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
1. 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.9868499573742541
|
|
|
|
key: test_recall
|
|
value: [0.75 1. 0.85714286 1. 0.85714286 0.85714286
|
|
1. 0.625 0.875 1. ]
|
|
|
|
mean value: 0.8821428571428571
|
|
|
|
key: train_recall
|
|
value: [0.98529412 0.98529412 0.98550725 0.98550725 0.98550725 0.98550725
|
|
0.98529412 0.98529412 0.98529412 0.98529412]
|
|
|
|
mean value: 0.98537936913896
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 1. 0.92857143 0.875 0.92857143 0.92857143
|
|
0.78571429 0.8125 0.86607143 0.71428571]
|
|
|
|
mean value: 0.8651785714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.98529412 0.98529412 0.98540068 0.98540068 0.98540068 0.98540068
|
|
0.99264706 0.98540068 0.98540068 0.98540068]
|
|
|
|
mean value: 0.9861040068201194
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 1. 0.85714286 0.77777778 0.85714286 0.85714286
|
|
0.72727273 0.625 0.77777778 0.66666667]
|
|
|
|
mean value: 0.7812590187590187
|
|
|
|
key: train_jcc
|
|
value: [0.97101449 0.97101449 0.97142857 0.97142857 0.97142857 0.97142857
|
|
0.98529412 0.97101449 0.97101449 0.97101449]
|
|
|
|
mean value: 0.9726080867129461
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04199767 0.03794289 0.03837681 0.05303359 0.15559244 0.13856936
|
|
0.05898094 0.04978466 0.07463431 0.04460335]
|
|
|
|
mean value: 0.06935160160064698
|
|
|
|
key: score_time
|
|
value: [0.01235557 0.01329517 0.01337838 0.01518774 0.02457047 0.03380871
|
|
0.0172708 0.01403213 0.02101779 0.01378417]
|
|
|
|
mean value: 0.017870092391967775
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.93202124 0.8951918 0.82490815 0.71611487 0.78772636
|
|
0.89342711 0.89342711 0.78772636 0.80439967]
|
|
|
|
mean value: 0.8467125880292498
|
|
|
|
key: train_mcc
|
|
value: [0.89746503 0.90927764 0.89754406 0.90138653 0.90163769 0.88213591
|
|
0.93313595 0.90551181 0.92125984 0.92520402]
|
|
|
|
mean value: 0.9074558497858128
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.94736842 0.9122807 0.85714286 0.89285714
|
|
0.94642857 0.94642857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9218045112781955
|
|
|
|
key: train_accuracy
|
|
value: [0.94871795 0.95463511 0.94871795 0.95069034 0.9507874 0.94094488
|
|
0.96653543 0.95275591 0.96062992 0.96259843]
|
|
|
|
mean value: 0.9537013309726816
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.96296296 0.94915254 0.91525424 0.86206897 0.89655172
|
|
0.94736842 0.94736842 0.88888889 0.90322581]
|
|
|
|
mean value: 0.9238359211104228
|
|
|
|
key: train_fscore
|
|
value: [0.9486166 0.95463511 0.94820717 0.95049505 0.95107632 0.94163424
|
|
0.9667319 0.95275591 0.96062992 0.96252465]
|
|
|
|
mean value: 0.9537306872118687
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 0.93333333 0.9 0.83333333 0.86666667
|
|
0.93103448 0.93103448 0.92307692 0.82352941]
|
|
|
|
mean value: 0.9075341967025538
|
|
|
|
key: train_precision
|
|
value: [0.95238095 0.95652174 0.95582329 0.95238095 0.94552529 0.93076923
|
|
0.96108949 0.95275591 0.96062992 0.96442688]
|
|
|
|
mean value: 0.9532303658068488
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 0.96551724 0.93103448 0.89285714 0.92857143
|
|
0.96428571 0.96428571 0.85714286 1. ]
|
|
|
|
mean value: 0.9432266009852217
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.95275591 0.94071146 0.9486166 0.95669291 0.95275591
|
|
0.97244094 0.95275591 0.96062992 0.96062992]
|
|
|
|
mean value: 0.9542871370327721
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.96428571 0.94704433 0.91194581 0.85714286 0.89285714
|
|
0.94642857 0.94642857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9217364532019705
|
|
|
|
key: train_roc_auc
|
|
value: [0.94872553 0.95463882 0.94870219 0.95068625 0.9507874 0.94094488
|
|
0.96653543 0.95275591 0.96062992 0.96259843]
|
|
|
|
mean value: 0.9537004761756559
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.92857143 0.90322581 0.84375 0.75757576 0.8125
|
|
0.9 0.9 0.8 0.82352941]
|
|
|
|
mean value: 0.8602485737696839
|
|
|
|
key: train_jcc
|
|
value: [0.90225564 0.91320755 0.90151515 0.90566038 0.90671642 0.88970588
|
|
0.93560606 0.90977444 0.92424242 0.92775665]
|
|
|
|
mean value: 0.9116440590335693
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])
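A pipeline of the printed shape (a `ColumnTransformer` with `MinMaxScaler` on numeric columns and `OneHotEncoder` on categorical ones, followed by a model) and the `key` / `value` / `mean value` report could be produced roughly as follows. This is a minimal sketch under stated assumptions, not the script's actual code: the data are synthetic, only three of the logged column names are reused, plain `LogisticRegression` stands in for `LogisticRegressionCV` to keep the run short, and the scoring dictionary and 10-fold stratified split are guesses based on the ten values per metric.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in for the real feature table (assumption).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, n),
    "ddg_foldx": rng.normal(0, 2, n),
    "ss_class": rng.choice(["helix", "sheet", "loop"], n),
})
y = (X["ligand_distance"] + rng.normal(0, 3, n) < 15).astype(int)

num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class"]

# Same layout as the logged pipeline: scale numerics, one-hot encode categoricals.
prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
    ],
    remainder="passthrough",
)
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# Two of the logged metrics; the others would be added the same way.
scoring = {"mcc": make_scorer(matthews_corrcoef), "accuracy": "accuracy"}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

With `return_train_score=True`, `cross_validate` returns `fit_time`, `score_time`, and a `test_` and `train_` entry per scorer, which matches the key names in this log.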

key: fit_time
value: [0.9814384 1.1654892 1.1827333 1.37071514 1.02586007 1.44153547
1.04063892 1.14810324 0.88139033 0.93454194]

mean value: 1.1172446012496948

key: score_time
value: [0.01375628 0.02266359 0.02546477 0.01354003 0.01353526 0.01361775
0.01248193 0.0207324 0.01353216 0.01379561]

mean value: 0.01631197929382324

key: test_mcc
value: [0.86851042 0.8951918 0.93202124 0.89952865 0.82195294 0.85714286
0.93094934 0.96490128 0.85933785 0.93094934]

mean value: 0.8960485710881759

key: train_mcc
value: [0.99211042 0.98028384 0.98817342 0.98425172 0.98032256 0.99212598
0.99212598 1. 0.99212598 0.98819663]

mean value: 0.9889716545512739

key: test_accuracy
value: [0.92982456 0.94736842 0.96491228 0.94736842 0.91071429 0.92857143
0.96428571 0.98214286 0.92857143 0.96428571]

mean value: 0.9468045112781955

key: train_accuracy
value: [0.99605523 0.99013807 0.99408284 0.99211045 0.99015748 0.99606299
0.99606299 1. 0.99606299 0.99409449]

mean value: 0.9944827532653093

key: test_fscore
value: [0.93333333 0.94545455 0.96666667 0.95081967 0.9122807 0.92857143
0.96551724 0.98245614 0.93103448 0.96551724]

mean value: 0.9481651453779626

key: train_fscore
value: [0.99606299 0.99013807 0.99408284 0.99212598 0.99013807 0.99606299
0.99606299 1. 0.99606299 0.99410609]

mean value: 0.9944843017488161

key: test_precision
value: [0.875 0.96296296 0.93548387 0.90625 0.89655172 0.92857143
0.93333333 0.96551724 0.9 0.93333333]

mean value: 0.9237003894686041

key: train_precision
value: [0.99606299 0.99209486 0.99212598 0.98823529 0.99209486 0.99606299
0.99606299 1. 0.99606299 0.99215686]

mean value: 0.9940959832938809

key: test_recall
value: [1. 0.92857143 1. 1. 0.92857143 0.92857143
1. 1. 0.96428571 1. ]

mean value: 0.975
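Each `mean value` line appears to be the arithmetic mean of the ten per-fold values printed just above it. As a quick check under that assumption, the `test_recall` folds copied from the log reproduce the reported 0.975 (the variable names are illustrative):

```python
from statistics import fmean

# Per-fold test_recall values copied from the log above.
test_recall = [1.0, 0.92857143, 1.0, 1.0, 0.92857143,
               0.92857143, 1.0, 1.0, 0.96428571, 1.0]

# Arithmetic mean across the ten folds.
mean_recall = fmean(test_recall)
print(f"mean value: {mean_recall:.3f}")
```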

key: train_recall
value: [0.99606299 0.98818898 0.99604743 0.99604743 0.98818898 0.99606299
0.99606299 1. 0.99606299 0.99606299]

mean value: 0.9948787775045906

key: test_roc_auc
value: [0.93103448 0.94704433 0.96428571 0.94642857 0.91071429 0.92857143
0.96428571 0.98214286 0.92857143 0.96428571]

mean value: 0.9467364532019705

key: train_roc_auc
value: [0.99605521 0.99014192 0.99408671 0.9921182 0.99015748 0.99606299
0.99606299 1. 0.99606299 0.99409449]

mean value: 0.9944842986523917

key: test_jcc
value: [0.875 0.89655172 0.93548387 0.90625 0.83870968 0.86666667
0.93333333 0.96551724 0.87096774 0.93333333]

mean value: 0.9021813589173155

key: train_jcc
value: [0.99215686 0.98046875 0.98823529 0.984375 0.98046875 0.99215686
0.99215686 1. 0.99215686 0.98828125]

mean value: 0.9890456495098039
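The log does not define `fscore` or `jcc`. Assuming they are the F1 score and the Jaccard index of the positive class, both follow from precision and recall alone, and the first two test folds above are consistent with that reading. The helper names below are hypothetical.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def jaccard_from_pr(precision: float, recall: float) -> float:
    """Jaccard index TP / (TP + FP + FN), rewritten via 1/J = 1/P + 1/R - 1."""
    return precision * recall / (precision + recall - precision * recall)


# (test_precision, test_recall) for folds 1 and 2, copied from the log.
folds = [(0.875, 1.0), (0.96296296, 0.92857143)]

f1_values = [f1_from_pr(p, r) for p, r in folds]
jcc_values = [jaccard_from_pr(p, r) for p, r in folds]

for f1, jcc in zip(f1_values, jcc_values):
    print(f"fscore: {f1:.8f}  jcc: {jcc:.8f}")
```

The results can be compared with the first two entries of `test_fscore` and `test_jcc` in the log.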

MCC on Blind test: 0.81

Accuracy on Blind test: 0.93
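MCC and accuracy, as reported for the blind test and for each fold, can both be computed from a binary confusion matrix. The log does not print confusion counts, so the counts below are an assumption: they were chosen to be consistent with the first cross-validation fold (57 samples, precision 0.875, recall 1.0) and are not read from the log. The function names are illustrative.

```python
from math import sqrt


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts consistent with CV fold 1 (assumption, not logged).
tp, tn, fp, fn = 28, 25, 4, 0

fold_mcc = mcc(tp, tn, fp, fn)
fold_acc = accuracy(tp, tn, fp, fn)
print(f"MCC: {fold_mcc:.4f}")
print(f"Accuracy: {fold_acc:.4f}")
```

MCC uses all four cells of the confusion matrix, which is why it can sit well below accuracy on the same predictions, as in the blind-test figures above (0.81 against 0.93).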

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0210526 0.01111865 0.01016021 0.01017928 0.00997281 0.01037288
 0.01188016 0.01034904 0.0100739 0.01008177]

mean value: 0.011524128913879394

key: score_time
value: [0.01222992 0.00975513 0.00912452 0.00903606 0.00887966 0.00971317
 0.01039076 0.00900006 0.00902438 0.00893545]

mean value: 0.009608912467956542

key: test_mcc
value: [0.75492611 0.69397486 0.75462449 0.58076493 0.50128041 0.75047877
 0.75047877 0.57142857 0.68250015 0.72168784]

mean value: 0.6762144912656113

key: train_mcc
value: [0.75216564 0.74385846 0.73986336 0.73178133 0.76786532 0.72461164
 0.69640469 0.7442387 0.79728008 0.70868339]

mean value: 0.740675262273998

key: test_accuracy
value: [0.87719298 0.84210526 0.87719298 0.78947368 0.75 0.875
 0.875 0.78571429 0.83928571 0.85714286]

mean value: 0.8368107769423558

key: train_accuracy
value: [0.87573964 0.87179487 0.86982249 0.86587771 0.88385827 0.86220472
 0.84448819 0.87204724 0.8976378 0.85433071]

mean value: 0.8697801643137804

key: test_fscore
value: [0.87719298 0.82352941 0.88135593 0.78571429 0.75862069 0.87719298
 0.87272727 0.78571429 0.83018868 0.86666667]

mean value: 0.8358903188603343

key: train_fscore
value: [0.87861272 0.87378641 0.87109375 0.86614173 0.88499025 0.86381323
 0.83227176 0.87329435 0.90114068 0.85490196]

mean value: 0.8700046844178336

key: test_precision
value: [0.86206897 0.91304348 0.86666667 0.81481481 0.73333333 0.86206897
 0.88888889 0.78571429 0.88 0.8125 ]

mean value: 0.8419099398713341

key: train_precision
value: [0.86037736 0.86206897 0.86100386 0.8627451 0.87644788 0.85384615
 0.90322581 0.86486486 0.87132353 0.8515625 ]

mean value: 0.8667466014073157

key: test_recall
value: [0.89285714 0.75 0.89655172 0.75862069 0.78571429 0.89285714
 0.85714286 0.78571429 0.78571429 0.92857143]

mean value: 0.8333743842364532

key: train_recall
value: [0.8976378 0.88582677 0.88142292 0.86956522 0.89370079 0.87401575
 0.77165354 0.88188976 0.93307087 0.85826772]

mean value: 0.8747051134418474

key: test_roc_auc
value: [0.87746305 0.84051724 0.87684729 0.79002463 0.75 0.875
 0.875 0.78571429 0.83928571 0.85714286]

mean value: 0.8366995073891625

key: train_roc_auc
value: [0.87569637 0.87176714 0.86984532 0.86588497 0.88385827 0.86220472
 0.84448819 0.87204724 0.8976378 0.85433071]

mean value: 0.8697760729513554

key: test_jcc
value: [0.78125 0.7 0.78787879 0.64705882 0.61111111 0.78125
 0.77419355 0.64705882 0.70967742 0.76470588]

mean value: 0.7204184396143599

key: train_jcc
value: [0.78350515 0.77586207 0.7716263 0.76388889 0.79370629 0.76027397
 0.71272727 0.77508651 0.8200692 0.74657534]

mean value: 0.7703321000916057

MCC on Blind test: 0.43

Accuracy on Blind test: 0.78

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01048732 0.01034641 0.01061344 0.01354074 0.01033831 0.01015806
 0.01025128 0.01018333 0.01019001 0.01029396]

mean value: 0.010640287399291992

key: score_time
value: [0.00899577 0.00907207 0.00908756 0.00907373 0.00891471 0.00884938
 0.00887942 0.00890255 0.00902867 0.00936127]

mean value: 0.00901651382446289

key: test_mcc
value: [0.72242731 0.61405719 0.58076493 0.47413793 0.39310793 0.60753044
 0.64951905 0.75047877 0.58501794 0.57142857]

mean value: 0.5948470056906343

key: train_mcc
value: [0.61736329 0.62938349 0.62938349 0.63709364 0.64961133 0.59933628
 0.63787438 0.59872224 0.62622211 0.63009708]

mean value: 0.6255087330440435

key: test_accuracy
value: [0.84210526 0.80701754 0.78947368 0.73684211 0.69642857 0.80357143
 0.82142857 0.875 0.78571429 0.78571429]

mean value: 0.794329573934837

key: train_accuracy
value: [0.8086785 0.81459566 0.81459566 0.81854043 0.82480315 0.7992126
 0.81889764 0.7992126 0.81299213 0.81496063]

mean value: 0.8126488996567737

key: test_fscore
value: [0.86153846 0.8 0.78571429 0.73684211 0.70175439 0.80701754
 0.83333333 0.87272727 0.76 0.78571429]

mean value: 0.7944641674115358

key: train_fscore
value: [0.8086785 0.812749 0.81640625 0.81746032 0.82445759 0.79352227
 0.8203125 0.796 0.81553398 0.812749 ]

mean value: 0.8117869417892003

key: test_precision
value: [0.75675676 0.81481481 0.81481481 0.75 0.68965517 0.79310345
 0.78125 0.88888889 0.86363636 0.78571429]

mean value: 0.7938634545315579

key: train_precision
value: [0.81027668 0.82258065 0.80694981 0.82071713 0.82608696 0.81666667
 0.81395349 0.80894309 0.8045977 0.82258065]

mean value: 0.8153352810729207

key: test_recall
value: [1. 0.78571429 0.75862069 0.72413793 0.71428571 0.82142857
 0.89285714 0.85714286 0.67857143 0.78571429]

mean value: 0.8018472906403941

key: train_recall
value: [0.80708661 0.80314961 0.82608696 0.81422925 0.82283465 0.77165354
 0.82677165 0.78346457 0.82677165 0.80314961]

mean value: 0.8085198095297377

key: test_roc_auc
value: [0.84482759 0.80665025 0.79002463 0.73706897 0.69642857 0.80357143
 0.82142857 0.875 0.78571429 0.78571429]

mean value: 0.7946428571428572

key: train_roc_auc
value: [0.80868165 0.81461828 0.81461828 0.81853195 0.82480315 0.7992126
 0.81889764 0.7992126 0.81299213 0.81496063]

mean value: 0.8126528897326569

key: test_jcc
value: [0.75675676 0.66666667 0.64705882 0.58333333 0.54054054 0.67647059
 0.71428571 0.77419355 0.61290323 0.64705882]

mean value: 0.6619268021070678

key: train_jcc
value: [0.67880795 0.68456376 0.68976898 0.69127517 0.70134228 0.65771812
 0.69536424 0.66112957 0.68852459 0.68456376]

mean value: 0.6833058407846723

MCC on Blind test: 0.2

Accuracy on Blind test: 0.69

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.01001644 0.01116848 0.01114178 0.01109457 0.01129532 0.01162243
 0.0112431 0.01212406 0.01101089 0.01048851]

mean value: 0.01112055778503418

key: score_time
value: [0.01606083 0.01568818 0.01301765 0.01356721 0.01365757 0.01359797
 0.01383781 0.01748276 0.01388144 0.01317573]

mean value: 0.01439671516418457

key: test_mcc
value: [0.7366424 0.6166424 0.68434084 0.6317806 0.5118907 0.58501794
 0.53605627 0.68250015 0.58501794 0.46697379]

mean value: 0.6036863009651918

key: train_mcc
value: [0.76398832 0.74554603 0.75880927 0.76806178 0.79775247 0.78489793
 0.76354997 0.76417218 0.73925749 0.79155948]

mean value: 0.767759492902757

key: test_accuracy
value: [0.85964912 0.80701754 0.84210526 0.80701754 0.75 0.78571429
 0.76785714 0.83928571 0.78571429 0.73214286]

mean value: 0.7976503759398497

key: train_accuracy
value: [0.87771203 0.86982249 0.87573964 0.8816568 0.8976378 0.88779528
 0.87992126 0.87992126 0.86811024 0.89370079]

mean value: 0.8812017580642656

key: test_fscore
value: [0.87096774 0.79245283 0.84745763 0.83076923 0.77419355 0.80645161
 0.77192982 0.84745763 0.76 0.74576271]

mean value: 0.8047442754846815

key: train_fscore
value: [0.88644689 0.87777778 0.88354898 0.88764045 0.90151515 0.89579525
 0.88555347 0.88598131 0.87382298 0.8988764 ]

mean value: 0.8876958654685702

key: test_precision
value: [0.79411765 0.84 0.83333333 0.75 0.70588235 0.73529412
 0.75862069 0.80645161 0.86363636 0.70967742]

mean value: 0.7797013536529993

key: train_precision
value: [0.82876712 0.82867133 0.82986111 0.84341637 0.86861314 0.83617747
 0.84587814 0.84341637 0.83754513 0.85714286]

mean value: 0.841948903606986

key: test_recall
value: [0.96428571 0.75 0.86206897 0.93103448 0.85714286 0.89285714
 0.78571429 0.89285714 0.67857143 0.78571429]

mean value: 0.8400246305418719

key: train_recall
value: [0.95275591 0.93307087 0.94466403 0.93675889 0.93700787 0.96456693
 0.92913386 0.93307087 0.91338583 0.94488189]

mean value: 0.9389296940649218

key: test_roc_auc
value: [0.8614532 0.80603448 0.84174877 0.80480296 0.75 0.78571429
 0.76785714 0.83928571 0.78571429 0.73214286]

mean value: 0.797475369458128

key: train_roc_auc
value: [0.87756372 0.86969749 0.87587532 0.88176527 0.8976378 0.88779528
 0.87992126 0.87992126 0.86811024 0.89370079]

mean value: 0.8811988422395818

key: test_jcc
value: [0.77142857 0.65625 0.73529412 0.71052632 0.63157895 0.67567568
 0.62857143 0.73529412 0.61290323 0.59459459]

mean value: 0.6752116994528734

key: train_jcc
value: [0.79605263 0.78217822 0.79139073 0.7979798 0.82068966 0.81125828
 0.79461279 0.79530201 0.77591973 0.81632653]

mean value: 0.7981710380264788

MCC on Blind test: 0.24

Accuracy on Blind test: 0.7

Model_name: SVM
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.03288817 0.02851772 0.02310157 0.02252603 0.02221966 0.02230144
 0.02194214 0.02222419 0.0223546 0.02229118]

mean value: 0.024036669731140138

key: score_time
value: [0.01327085 0.01344895 0.0130353 0.01236677 0.01236677 0.01230669
 0.01217031 0.01236486 0.01230979 0.01227903]

mean value: 0.012591934204101563

key: test_mcc
value: [0.80817326 0.8951918 0.82880708 0.79682005 0.61065803 0.72168784
 0.79385662 0.82195294 0.64450339 0.77459667]

mean value: 0.7696247676639518

key: train_mcc
value: [0.82431719 0.86987986 0.8390677 0.85106594 0.84756752 0.84464326
 0.86681377 0.83968318 0.87412415 0.84725158]

mean value: 0.8504414138602343

key: test_accuracy
value: [0.89473684 0.94736842 0.9122807 0.89473684 0.80357143 0.85714286
 0.89285714 0.91071429 0.82142857 0.875 ]

mean value: 0.8809837092731829

key: train_accuracy
value: [0.9112426 0.93491124 0.91913215 0.92504931 0.92322835 0.92125984
 0.93307087 0.91929134 0.93700787 0.92322835]

mean value: 0.924742191989315

key: test_fscore
value: [0.90322581 0.94545455 0.91803279 0.90322581 0.81355932 0.86666667
 0.9 0.9122807 0.81481481 0.88888889]

mean value: 0.8866149339401671

key: train_fscore
value: [0.91428571 0.93542074 0.92069632 0.92664093 0.92514395 0.92395437
 0.93436293 0.92130518 0.9375 0.92485549]

mean value: 0.9264165644110587

key: test_precision
value: [0.82352941 0.96296296 0.875 0.84848485 0.77419355 0.8125
 0.84375 0.89655172 0.84615385 0.8 ]

mean value: 0.8483126341891392

key: train_precision
value: [0.88560886 0.92996109 0.90151515 0.90566038 0.90262172 0.89338235
 0.91666667 0.8988764 0.93023256 0.90566038]

mean value: 0.9070185556903059

key: test_recall
value: [1. 0.92857143 0.96551724 0.96551724 0.85714286 0.92857143
 0.96428571 0.92857143 0.78571429 1. ]

mean value: 0.9323891625615763

key: train_recall
value: [0.94488189 0.94094488 0.94071146 0.9486166 0.9488189 0.95669291
 0.95275591 0.94488189 0.94488189 0.94488189]

mean value: 0.9468068220721422

key: test_roc_auc
value: [0.89655172 0.94704433 0.91133005 0.89347291 0.80357143 0.85714286
 0.89285714 0.91071429 0.82142857 0.875 ]

mean value: 0.8809113300492611

key: train_roc_auc
value: [0.91117612 0.93489932 0.91917463 0.9250957 0.92322835 0.92125984
 0.93307087 0.91929134 0.93700787 0.92322835]

mean value: 0.924743238616912

key: test_jcc
value: [0.82352941 0.89655172 0.84848485 0.82352941 0.68571429 0.76470588
 0.81818182 0.83870968 0.6875 0.8 ]

mean value: 0.7986907059820592

key: train_jcc
value: [0.84210526 0.87867647 0.85304659 0.86330935 0.86071429 0.85865724
 0.87681159 0.85409253 0.88235294 0.86021505]

mean value: 0.8629981326609936

MCC on Blind test: 0.66

Accuracy on Blind test: 0.88

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
  warnings.warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [2.15022874 2.26373672 2.10535789 1.289114 2.09971786 2.04547882
 2.0241437 2.11169195 2.62805915 2.49180293]

mean value: 2.120933175086975

key: score_time
value: [0.01304388 0.01381183 0.02200556 0.01263905 0.01398277 0.03204012
 0.01388478 0.01397943 0.02038717 0.0143764 ]

mean value: 0.01701509952545166

key: test_mcc
value: [0.96551724 0.89988258 0.96547546 0.83703659 0.79385662 0.89342711
 0.96490128 0.96490128 0.93094934 0.8660254 ]

mean value: 0.908197290053634

key: train_mcc
value: [0.99606293 0.99211042 1. 0.99211042 1. 0.99607071
 0.99607071 1. 0.99607071 0.99212598]

mean value: 0.9960621896302765

key: test_accuracy
value: [0.98245614 0.94736842 0.98245614 0.9122807 0.89285714 0.94642857
 0.98214286 0.98214286 0.96428571 0.92857143]

mean value: 0.9520989974937343

key: train_accuracy
value: [0.99802761 0.99605523 1. 0.99605523 1. 0.9980315
 0.9980315 1. 0.9980315 0.99606299]

mean value: 0.9980295547376105

key: test_fscore
value: [0.98245614 0.94915254 0.98305085 0.92063492 0.9 0.94545455
 0.98245614 0.98245614 0.96551724 0.93333333]

mean value: 0.9544511851685249

key: train_fscore
value: [0.99803536 0.99606299 1. 0.99604743 1. 0.99803536
 0.99803536 1. 0.99803536 0.99606299]

mean value: 0.9980314868913049

key: test_precision
value: [0.96551724 0.90322581 0.96666667 0.85294118 0.84375 0.96296296
 0.96551724 0.96551724 0.93333333 0.875 ]

mean value: 0.9234431670023096

key: train_precision
value: [0.99607843 0.99606299 1. 0.99604743 1. 0.99607843
 0.99607843 1. 0.99607843 0.99606299]

mean value: 0.9972487140572204

key: test_recall
value: [1. 1. 1. 1. 0.96428571 0.92857143
 1. 1. 1. 1. ]

mean value: 0.9892857142857143

key: train_recall
value: [1. 0.99606299 1. 0.99604743 1. 1.
 1. 1. 1. 0.99606299]

mean value: 0.9988173415082008

key: test_roc_auc
value: [0.98275862 0.94827586 0.98214286 0.91071429 0.89285714 0.94642857
 0.98214286 0.98214286 0.96428571 0.92857143]

mean value: 0.9520320197044335

key: train_roc_auc
value: [0.99802372 0.99605521 1. 0.99605521 1. 0.9980315
 0.9980315 1. 0.9980315 0.99606299]

mean value: 0.9980291618686005

key: test_jcc
value: [0.96551724 0.90322581 0.96666667 0.85294118 0.81818182 0.89655172
 0.96551724 0.96551724 0.93333333 0.875 ]

mean value: 0.9142452249379882

key: train_jcc
value: [0.99607843 0.99215686 1. 0.99212598 1. 0.99607843
 0.99607843 1. 0.99607843 0.99215686]

mean value: 0.9960753435232361

MCC on Blind test: 0.8

Accuracy on Blind test: 0.93

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.02960896 0.02699208 0.02417684 0.03163671 0.02562284 0.02379894
0.03221059 0.02416635 0.02949929 0.02938581]

mean value: 0.02770984172821045

key: score_time
value: [0.01288438 0.01086068 0.00921392 0.0142808 0.0104208 0.01039267
0.01026678 0.00972581 0.01522899 0.01317906]

mean value: 0.01164538860321045

key: test_mcc
value: [0.96551724 1. 0.96547546 0.96547546 0.89342711 0.96490128
0.96490128 0.93094934 0.89342711 0.96490128]

mean value: 0.9508975559462645

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98245614 1. 0.98245614 0.98245614 0.94642857 0.98214286
0.98214286 0.96428571 0.94642857 0.98214286]

mean value: 0.975093984962406

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98245614 1. 0.98305085 0.98305085 0.94736842 0.98245614
0.98245614 0.96551724 0.94736842 0.98245614]

mean value: 0.9756180339803336

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.96551724 1. 0.96666667 0.96666667 0.93103448 0.96551724
0.96551724 0.93333333 0.93103448 0.96551724]

mean value: 0.9590804597701149

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.96428571 1.
1. 1. 0.96428571 1. ]

mean value: 0.9928571428571429

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98275862 1. 0.98214286 0.98214286 0.94642857 0.98214286
0.98214286 0.96428571 0.94642857 0.98214286]

mean value: 0.9750615763546799

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96551724 1. 0.96666667 0.96666667 0.9 0.96551724
0.96551724 0.93333333 0.9 0.96551724]

mean value: 0.9528735632183908

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.87

Accuracy on Blind test: 0.96

Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.16124463 0.13249683 0.12512231 0.123142 0.12262487 0.12115049
0.1241188 0.12215066 0.12319398 0.12175512]

mean value: 0.12769997119903564

key: score_time
value: [0.02338743 0.01843238 0.02020693 0.01953816 0.01848555 0.01992226
0.01976848 0.01945233 0.0198977 0.0199244 ]

mean value: 0.019901561737060546

key: test_mcc
value: [1. 1. 0.96547546 0.96547546 0.89342711 0.93094934
1. 0.96490128 0.89342711 0.93094934]

mean value: 0.9544605091626567

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 1. 0.98245614 0.98245614 0.94642857 0.96428571
1. 0.98214286 0.94642857 0.96428571]

mean value: 0.9768483709273182

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 1. 0.98305085 0.98305085 0.94736842 0.96296296
1. 0.98245614 0.94736842 0.96551724]

mean value: 0.9771774881713668

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 0.96666667 0.96666667 0.93103448 1.
1. 0.96551724 0.93103448 0.93333333]

mean value: 0.9694252873563218

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.96428571 0.92857143
1. 1. 0.96428571 1. ]

mean value: 0.9857142857142858

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 1. 0.98214286 0.98214286 0.94642857 0.96428571
1. 0.98214286 0.94642857 0.96428571]

mean value: 0.9767857142857144

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 1. 0.96666667 0.96666667 0.9 0.92857143
1. 0.96551724 0.9 0.93333333]

mean value: 0.9560755336617406

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.6

Accuracy on Blind test: 0.88

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.01061201 0.01145554 0.01236582 0.01151681 0.01169515 0.01167917
0.01173377 0.01138639 0.01171923 0.0116694 ]

mean value: 0.011583328247070312

key: score_time
value: [0.00956845 0.00910449 0.01007557 0.00983143 0.00965309 0.00978756
0.00995731 0.00933337 0.01028538 0.00958109]

mean value: 0.009717774391174317

key: test_mcc
value: [0.80817326 0.86851042 0.89952865 0.86789789 0.64116714 0.85714286
0.89802651 0.8660254 0.89342711 0.83484711]

mean value: 0.8434746352564939

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.89473684 0.92982456 0.94736842 0.92982456 0.80357143 0.92857143
0.94642857 0.92857143 0.94642857 0.91071429]

mean value: 0.9166040100250626

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.90322581 0.93333333 0.95081967 0.93548387 0.83076923 0.92857143
0.94915254 0.93333333 0.94736842 0.91803279]

mean value: 0.9230090425868587

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.82352941 0.875 0.90625 0.87878788 0.72972973 0.92857143
0.90322581 0.875 0.93103448 0.84848485]

mean value: 0.8699613586548824

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.96428571 0.92857143
1. 1. 0.96428571 1. ]

mean value: 0.9857142857142858

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.89655172 0.93103448 0.94642857 0.92857143 0.80357143 0.92857143
0.94642857 0.92857143 0.94642857 0.91071429]

mean value: 0.9166871921182267

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.82352941 0.875 0.90625 0.87878788 0.71052632 0.86666667
0.90322581 0.875 0.9 0.84848485]

mean value: 0.8587470927945187

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.53

Accuracy on Blind test: 0.86

Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])

key: fit_time
value: [1.88762093 1.83851171 1.77748513 1.75392604 1.80487251 1.68222833
1.85387468 1.78196788 1.73018646 1.72484589]

mean value: 1.7835519552230834

key: score_time
value: [0.10137248 0.10127664 0.10406446 0.0981307 0.09335041 0.10171962
0.10252476 0.09720516 0.09547591 0.09470224]

mean value: 0.09898223876953124

key: test_mcc
value: [0.96551724 1. 0.96547546 0.96547546 0.89342711 0.93094934
1. 0.96490128 0.89342711 0.96490128]

mean value: 0.954407427810863

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98245614 1. 0.98245614 0.98245614 0.94642857 0.96428571
1. 0.98214286 0.94642857 0.98214286]

mean value: 0.9768796992481202

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98245614 1. 0.98305085 0.98305085 0.94736842 0.96296296
1. 0.98245614 0.94736842 0.98245614]

mean value: 0.9771169921036111

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.96551724 1. 0.96666667 0.96666667 0.93103448 1.
1. 0.96551724 0.93103448 0.96551724]

mean value: 0.9691954022988506

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.96428571 0.92857143
1. 1. 0.96428571 1. ]

mean value: 0.9857142857142858

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98275862 1. 0.98214286 0.98214286 0.94642857 0.96428571
1. 0.98214286 0.94642857 0.98214286]

mean value: 0.9768472906403942

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96551724 1. 0.96666667 0.96666667 0.9 0.92857143
1. 0.96551724 0.9 0.96551724]

mean value: 0.9558456486042693

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.79

Accuracy on Blind test: 0.93

Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...05', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])

key: fit_time
value: [0.95136046 0.98831916 0.95716357 0.93647122 0.99228835 1.02972484
0.97567821 0.97857833 0.98092842 0.9740994 ]

mean value: 0.9764611959457398

key: score_time
value: [0.24091625 0.2508409 0.16018391 0.22381854 0.2695148 0.21784067
0.22880816 0.24814248 0.26897693 0.26349545]

mean value: 0.23725380897521972

key: test_mcc
value: [0.96551724 0.93202124 0.96547546 1. 0.93094934 0.93094934
1. 0.96490128 0.93094934 0.93094934]

mean value: 0.9551712565684981

key: train_mcc
value: [0.98046604 0.9685613 0.98046755 0.97660594 0.98437404 0.97665048
0.98050495 0.98437404 0.98050495 0.98050495]

mean value: 0.979301423744519

key: test_accuracy
value: [0.98245614 0.96491228 0.98245614 1. 0.96428571 0.96428571
1. 0.98214286 0.96428571 0.96428571]

mean value: 0.9769110275689223

key: train_accuracy
value: [0.99013807 0.98422091 0.99013807 0.98816568 0.99212598 0.98818898
0.99015748 0.99212598 0.99015748 0.99015748]

mean value: 0.9895576107720263

key: test_fscore
value: [0.98245614 0.96296296 0.98305085 1. 0.96551724 0.96296296
1. 0.98245614 0.96551724 0.96551724]

mean value: 0.9770440778223238

key: train_fscore
value: [0.99025341 0.984375 0.99021526 0.98828125 0.9921875 0.98832685
0.99025341 0.9921875 0.99025341 0.99025341]

mean value: 0.9896587007661066

key: test_precision
value: [0.96551724 1. 0.96666667 1. 0.93333333 1.
1. 0.96551724 0.93333333 0.93333333]

mean value: 0.9697701149425287

key: train_precision
value: [0.98069498 0.97674419 0.98062016 0.97683398 0.98449612 0.97692308
0.98069498 0.98449612 0.98069498 0.98069498]

mean value: 0.9802893565684263

key: test_recall
value: [1. 0.92857143 1. 1. 1. 0.92857143
1. 1. 1. 1. ]

mean value: 0.9857142857142858

key: train_recall
value: [1. 0.99212598 1. 1. 1. 1.
1. 1. 1. 1. ]

mean value: 0.9992125984251968

key: test_roc_auc
value: [0.98275862 0.96428571 0.98214286 1. 0.96428571 0.96428571
1. 0.98214286 0.96428571 0.96428571]

mean value: 0.9768472906403941

key: train_roc_auc
value: [0.99011858 0.98420528 0.99015748 0.98818898 0.99212598 0.98818898
0.99015748 0.99212598 0.99015748 0.99015748]

mean value: 0.9895583704210887

key: test_jcc
value: [0.96551724 0.92857143 0.96666667 1. 0.93333333 0.92857143
1. 0.96551724 0.93333333 0.93333333]

mean value: 0.9554844006568145

key: train_jcc
value: [0.98069498 0.96923077 0.98062016 0.97683398 0.98449612 0.97692308
0.98069498 0.98449612 0.98069498 0.98069498]

mean value: 0.9795380148868521

MCC on Blind test: 0.83

Accuracy on Blind test: 0.94

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])

key: fit_time
value: [0.01103711 0.01071525 0.01178098 0.01100349 0.01181769 0.0120666
0.01212502 0.01112056 0.01161766 0.01164937]

mean value: 0.011493372917175292

key: score_time
value: [0.01010919 0.00939178 0.00976348 0.01021886 0.01012588 0.0094111
0.00998759 0.00937343 0.00957465 0.00951076]

mean value: 0.009746670722961426

key: test_mcc
value: [0.72242731 0.61405719 0.58076493 0.47413793 0.39310793 0.60753044
0.64951905 0.75047877 0.58501794 0.57142857]

mean value: 0.5948470056906343

key: train_mcc
value: [0.61736329 0.62938349 0.62938349 0.63709364 0.64961133 0.59933628
0.63787438 0.59872224 0.62622211 0.63009708]

mean value: 0.6255087330440435

key: test_accuracy
value: [0.84210526 0.80701754 0.78947368 0.73684211 0.69642857 0.80357143
0.82142857 0.875 0.78571429 0.78571429]

mean value: 0.794329573934837

key: train_accuracy
value: [0.8086785 0.81459566 0.81459566 0.81854043 0.82480315 0.7992126
0.81889764 0.7992126 0.81299213 0.81496063]

mean value: 0.8126488996567737

key: test_fscore
value: [0.86153846 0.8 0.78571429 0.73684211 0.70175439 0.80701754
0.83333333 0.87272727 0.76 0.78571429]

mean value: 0.7944641674115358

key: train_fscore
value: [0.8086785 0.812749 0.81640625 0.81746032 0.82445759 0.79352227
0.8203125 0.796 0.81553398 0.812749 ]

mean value: 0.8117869417892003

key: test_precision
value: [0.75675676 0.81481481 0.81481481 0.75 0.68965517 0.79310345
0.78125 0.88888889 0.86363636 0.78571429]

mean value: 0.7938634545315579

key: train_precision
value: [0.81027668 0.82258065 0.80694981 0.82071713 0.82608696 0.81666667
0.81395349 0.80894309 0.8045977 0.82258065]

mean value: 0.8153352810729207

key: test_recall
value: [1. 0.78571429 0.75862069 0.72413793 0.71428571 0.82142857
0.89285714 0.85714286 0.67857143 0.78571429]

mean value: 0.8018472906403941

key: train_recall
value: [0.80708661 0.80314961 0.82608696 0.81422925 0.82283465 0.77165354
0.82677165 0.78346457 0.82677165 0.80314961]

mean value: 0.8085198095297377

key: test_roc_auc
value: [0.84482759 0.80665025 0.79002463 0.73706897 0.69642857 0.80357143
0.82142857 0.875 0.78571429 0.78571429]

mean value: 0.7946428571428572

key: train_roc_auc
value: [0.80868165 0.81461828 0.81461828 0.81853195 0.82480315 0.7992126
0.81889764 0.7992126 0.81299213 0.81496063]

mean value: 0.8126528897326569

key: test_jcc
value: [0.75675676 0.66666667 0.64705882 0.58333333 0.54054054 0.67647059
0.71428571 0.77419355 0.61290323 0.64705882]

mean value: 0.6619268021070678

key: train_jcc
value: [0.67880795 0.68456376 0.68976898 0.69127517 0.70134228 0.65771812
0.69536424 0.66112957 0.68852459 0.68456376]

mean value: 0.6833058407846723

MCC on Blind test: 0.2

Accuracy on Blind test: 0.69

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.0972321 0.07731676 0.07649851 0.0709219 0.07835388 0.07469583
0.0731678 0.22985172 0.07314754 0.074085 ]

mean value: 0.09252710342407226

key: score_time
value: [0.01207376 0.01114559 0.01213479 0.01066709 0.01160574 0.01146913
0.01133513 0.01144457 0.01144123 0.01136351]

mean value: 0.011468052864074707

key: test_mcc
value: [0.96551724 1. 0.96547546 0.96547546 0.92857143 0.96490128
1. 0.96490128 0.93094934 0.96490128]

mean value: 0.9650692763304416

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98245614 1. 0.98245614 0.98245614 0.96428571 0.98214286
1. 0.98214286 0.96428571 0.98214286]

mean value: 0.9822368421052632

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98245614 1. 0.98305085 0.98305085 0.96428571 0.98245614
1. 0.98245614 0.96551724 0.98245614]

mean value: 0.9825729211983787

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.96551724 1. 0.96666667 0.96666667 0.96428571 0.96551724
1. 0.96551724 0.93333333 0.96551724]

mean value: 0.9693021346469622

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 0.96428571 1.
1. 1. 1. 1. ]

mean value: 0.9964285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98275862 1. 0.98214286 0.98214286 0.96428571 0.98214286
1. 0.98214286 0.96428571 0.98214286]

mean value: 0.9822044334975369

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96551724 1. 0.96666667 0.96666667 0.93103448 0.96551724
1. 0.96551724 0.93333333 0.96551724]

mean value: 0.9659770114942529

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.88

Accuracy on Blind test: 0.96

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.045156 0.05143094 0.04170799 0.07557249 0.04485083 0.05265903
0.04891658 0.07405901 0.04442978 0.07700086]

mean value: 0.05557835102081299

key: score_time
value: [0.01947165 0.01236486 0.01648664 0.01348376 0.02269197 0.01272631
0.02324986 0.01244068 0.01243663 0.01249099]

mean value: 0.015784335136413575

key: test_mcc
value: [0.96551724 0.85960591 0.79110556 0.89952865 0.85933785 1.
0.82618439 0.89802651 0.89802651 0.8660254 ]

mean value: 0.8863358024357655

key: train_mcc
value: [0.96055211 0.96847134 0.95661443 0.95661511 0.9645744 0.96062992
0.9645744 0.96850394 0.96062992 0.96062992]

mean value: 0.9621795507209824

key: test_accuracy
value: [0.98245614 0.92982456 0.89473684 0.94736842 0.92857143 1.
0.91071429 0.94642857 0.94642857 0.92857143]

mean value: 0.9415100250626567

key: train_accuracy
value: [0.98027613 0.98422091 0.97830375 0.97830375 0.98228346 0.98031496
0.98228346 0.98425197 0.98031496 0.98031496]

mean value: 0.9810868316016711

key: test_fscore
value: [0.98245614 0.92857143 0.9 0.95081967 0.92592593 1.
0.91525424 0.94915254 0.94915254 0.93333333]

mean value: 0.9434665822346611

key: train_fscore
value: [0.98031496 0.98431373 0.97821782 0.97830375 0.98231827 0.98031496
0.98231827 0.98425197 0.98031496 0.98031496]

mean value: 0.98109836480702

key: test_precision
value: [0.96551724 0.92857143 0.87096774 0.90625 0.96153846 1.
0.87096774 0.90322581 0.90322581 0.875 ]

mean value: 0.9185264228263395

key: train_precision
value: [0.98031496 0.98046875 0.98015873 0.97637795 0.98039216 0.98031496
0.98039216 0.98425197 0.98031496 0.98031496]

mean value: 0.9803301557663748

key: test_recall
value: [1. 0.92857143 0.93103448 1. 0.89285714 1.
0.96428571 1. 1. 1. ]

mean value: 0.9716748768472907

key: train_recall
value: [0.98031496 0.98818898 0.97628458 0.98023715 0.98425197 0.98031496
0.98425197 0.98425197 0.98031496 0.98031496]

mean value: 0.9818726463539884

key: test_roc_auc
value: [0.98275862 0.92980296 0.89408867 0.94642857 0.92857143 1.
0.91071429 0.94642857 0.94642857 0.92857143]

mean value: 0.9413793103448276

key: train_roc_auc
value: [0.98027606 0.98421307 0.97829977 0.97830755 0.98228346 0.98031496
0.98228346 0.98425197 0.98031496 0.98031496]

mean value: 0.9810860228439825

key: test_jcc
value: [0.96551724 0.86666667 0.81818182 0.90625 0.86206897 1.
0.84375 0.90322581 0.90322581 0.875 ]

mean value: 0.8943886304648262

key: train_jcc
value: [0.96138996 0.96911197 0.95736434 0.95752896 0.96525097 0.96138996
0.96525097 0.96899225 0.96138996 0.96138996]

mean value: 0.962905929184999

MCC on Blind test: 0.61

Accuracy on Blind test: 0.86

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01605272 0.01064062 0.01053286 0.01062799 0.01046133 0.01057339
0.01031685 0.01173782 0.01179981 0.01043272]

mean value: 0.011317610740661621

key: score_time
value: [0.01308727 0.0094192 0.00935388 0.00926232 0.0092082 0.00946975
0.00921726 0.00949192 0.00977159 0.00906062]

mean value: 0.009734201431274413

key: test_mcc
value: [0.70694956 0.79682005 0.61405719 0.54433498 0.35805744 0.57735027
0.61065803 0.4645821 0.61706091 0.61065803]

mean value: 0.59005285401838

key: train_mcc
value: [0.65069271 0.60967718 0.61360065 0.56269586 0.67365136 0.57949966
0.61061966 0.63009708 0.67887215 0.56699945]

mean value: 0.6176405758990616

key: test_accuracy
value: [0.84210526 0.89473684 0.80701754 0.77192982 0.67857143 0.78571429
0.80357143 0.73214286 0.80357143 0.80357143]

mean value: 0.7922932330827067

key: train_accuracy
value: [0.82445759 0.80473373 0.80670611 0.78106509 0.83661417 0.78937008
0.80511811 0.81496063 0.83858268 0.78346457]

mean value: 0.8085072760875305

key: test_fscore
value: [0.85714286 0.88461538 0.81355932 0.77192982 0.68965517 0.8
0.81355932 0.73684211 0.78431373 0.81355932]

mean value: 0.7965177035588487

key: train_fscore
value: [0.83111954 0.80776699 0.80859375 0.78529981 0.83945841 0.79462572
0.80851064 0.81712062 0.84410646 0.78515625]

mean value: 0.812175819990016

key: test_precision
value: [0.77142857 0.95833333 0.8 0.78571429 0.66666667 0.75
0.77419355 0.72413793 0.86956522 0.77419355]

mean value: 0.7874233102342838

key: train_precision
value: [0.8021978 0.79693487 0.7992278 0.76893939 0.82509506 0.7752809
0.79467681 0.80769231 0.81617647 0.77906977]

mean value: 0.7965291168982057

key: test_recall
value: [0.96428571 0.82142857 0.82758621 0.75862069 0.71428571 0.85714286
0.85714286 0.75 0.71428571 0.85714286]

mean value: 0.812192118226601

key: train_recall
value: [0.86220472 0.81889764 0.81818182 0.80237154 0.85433071 0.81496063
0.82283465 0.82677165 0.87401575 0.79133858]

mean value: 0.8285907690392456

key: test_roc_auc
value: [0.84421182 0.89347291 0.80665025 0.77216749 0.67857143 0.78571429
0.80357143 0.73214286 0.80357143 0.80357143]

mean value: 0.7923645320197044

key: train_roc_auc
value: [0.82438299 0.80470574 0.8067287 0.78110703 0.83661417 0.78937008
0.80511811 0.81496063 0.83858268 0.78346457]

mean value: 0.8085034701689957

key: test_jcc
value: [0.75 0.79310345 0.68571429 0.62857143 0.52631579 0.66666667
0.68571429 0.58333333 0.64516129 0.68571429]

mean value: 0.6650294813786413

key: train_jcc
value: [0.71103896 0.67752443 0.67868852 0.64649682 0.72333333 0.65923567
0.67857143 0.69078947 0.73026316 0.64630225]

mean value: 0.6842244043960553

MCC on Blind test: 0.59

Accuracy on Blind test: 0.83

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01944375 0.02495122 0.02653742 0.0216639 0.03189659 0.02602649
0.02330875 0.02988434 0.02322054 0.03054476]

mean value: 0.02574777603149414

key: score_time
value: [0.01058817 0.01182628 0.01213336 0.01234889 0.01249361 0.01251531
0.012465 0.01516986 0.0150609 0.02132058]

mean value: 0.013592195510864259

key: test_mcc
value: [0.89988258 0.8951918 0.7366424 0.89952865 0.78772636 0.89342711
0.52223297 0.92857143 0.70082556 0.8660254 ]

mean value: 0.8130054254678469

key: train_mcc
value: [0.92712676 0.95292731 0.93792915 0.90342654 0.96074906 0.95670033
0.6780635 0.97250878 0.89014893 0.96853396]

mean value: 0.9148114321896691

key: test_accuracy
value: [0.94736842 0.94736842 0.85964912 0.94736842 0.89285714 0.94642857
0.71428571 0.96428571 0.83928571 0.92857143]

mean value: 0.8987468671679197

key: train_accuracy
value: [0.96252465 0.97633136 0.96844181 0.95069034 0.98031496 0.97834646
0.81496063 0.98622047 0.94291339 0.98425197]

mean value: 0.9544996039696222

key: test_fscore
value: [0.94915254 0.94545455 0.84615385 0.95081967 0.89655172 0.94545455
0.77777778 0.96428571 0.81632653 0.93333333]

mean value: 0.9025310231713968

key: train_fscore
value: [0.96380952 0.9766537 0.96761134 0.95219885 0.98046875 0.978389
0.84385382 0.98613861 0.93995859 0.98418972]

mean value: 0.9573271907059853

key: test_precision
value: [0.90322581 0.96296296 0.95652174 0.90625 0.86666667 0.96296296
0.63636364 0.96428571 0.95238095 0.875 ]

mean value: 0.8986620441204943

key: train_precision
value: [0.93357934 0.96538462 0.99170124 0.92222222 0.97286822 0.97647059
0.72988506 0.99203187 0.99126638 0.98809524]

mean value: 0.9463504767125346

key: test_recall
value: [1. 0.92857143 0.75862069 1. 0.92857143 0.92857143
1. 0.96428571 0.71428571 1. ]

mean value: 0.9222906403940887

key: train_recall
value: [0.99606299 0.98818898 0.94466403 0.98418972 0.98818898 0.98031496
1. 0.98031496 0.89370079 0.98031496]

mean value: 0.973594036911394

key: test_roc_auc
value: [0.94827586 0.94704433 0.8614532 0.94642857 0.89285714 0.94642857
0.71428571 0.96428571 0.83928571 0.92857143]

mean value: 0.8988916256157635

key: train_roc_auc
value: [0.96245837 0.97630793 0.96839501 0.95075628 0.98031496 0.97834646
0.81496063 0.98622047 0.94291339 0.98425197]

mean value: 0.9544925461392425

key: test_jcc
value: [0.90322581 0.89655172 0.73333333 0.90625 0.8125 0.89655172
0.63636364 0.93103448 0.68965517 0.875 ]

mean value: 0.8280465879596859

key: train_jcc
value: [0.93014706 0.95437262 0.9372549 0.90875912 0.96168582 0.95769231
0.72988506 0.97265625 0.88671875 0.9688716 ]

mean value: 0.920804349269515

MCC on Blind test: 0.76

Accuracy on Blind test: 0.91

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.02200437 0.02010751 0.01893258 0.01999164 0.02031374 0.01846647
0.02055311 0.02092481 0.01913881 0.02208757]

mean value: 0.020252060890197755

key: score_time
value: [0.01222563 0.01219106 0.01215363 0.0128088 0.01377892 0.01212907
0.01220179 0.0122571 0.01221776 0.01227355]

mean value: 0.01242372989654541

key: test_mcc
value: [0.96551724 0.86789789 0.83703659 0.74822828 0.72168784 0.71611487
0.72168784 0.96490128 0.73127242 0.89802651]

mean value: 0.8172370767372027

key: train_mcc
value: [0.77941536 0.90393669 0.89862256 0.80222203 0.96463421 0.86255889
0.91852667 0.9645744 0.84762399 0.96137528]

mean value: 0.8903490079543654

key: test_accuracy
value: [0.98245614 0.92982456 0.9122807 0.85964912 0.85714286 0.85714286
0.85714286 0.98214286 0.85714286 0.94642857]

mean value: 0.9041353383458646

key: train_accuracy
value: [0.87968442 0.95069034 0.94674556 0.89151874 0.98228346 0.92716535
0.95866142 0.98228346 0.91929134 0.98031496]

mean value: 0.9418639053254438

key: test_fscore
value: [0.98245614 0.92307692 0.92063492 0.87878788 0.86666667 0.86206897
0.86666667 0.98245614 0.84 0.94915254]

mean value: 0.9071966844424932

key: train_fscore
value: [0.86474501 0.94887526 0.94934334 0.90196078 0.98238748 0.93186004
0.9596929 0.98231827 0.91295117 0.98069498]

mean value: 0.941482922079735

key: test_precision
value: [0.96551724 1. 0.85294118 0.78378378 0.8125 0.83333333
0.8125 0.96551724 0.95454545 0.90322581]

mean value: 0.8883864037343394

key: train_precision
value: [0.98984772 0.98723404 0.90357143 0.82142857 0.9766537 0.87543253
0.93632959 0.98039216 0.99078341 0.96212121]

mean value: 0.9423794347876031
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 1. 0.92857143 0.89285714
|
|
0.92857143 1. 0.75 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_recall
|
|
value: [0.76771654 0.91338583 1. 1. 0.98818898 0.99606299
|
|
0.98425197 0.98425197 0.84645669 1. ]
|
|
|
|
mean value: 0.9480314960629921
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.92857143 0.91071429 0.85714286 0.85714286 0.85714286
|
|
0.85714286 0.98214286 0.85714286 0.94642857]
|
|
|
|
mean value: 0.9036330049261084
|
|
|
|
key: train_roc_auc
|
|
value: [0.8799057 0.95076406 0.94685039 0.89173228 0.98228346 0.92716535
|
|
0.95866142 0.98228346 0.91929134 0.98031496]
|
|
|
|
mean value: 0.9419252435342815
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.85714286 0.85294118 0.78378378 0.76470588 0.75757576
|
|
0.76470588 0.96551724 0.72413793 0.90322581]
|
|
|
|
mean value: 0.8339253559923585
|
|
|
|
key: train_jcc
|
|
value: [0.76171875 0.90272374 0.90357143 0.82142857 0.96538462 0.87241379
|
|
0.92250923 0.96525097 0.83984375 0.96212121]
|
|
|
|
mean value: 0.8916966046361052
|
|
|
|
MCC on Blind test: 0.71
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18998051 0.18064475 0.18306255 0.18004155 0.18018317 0.17893577
|
|
0.17555499 0.179106 0.17361403 0.17628264]
|
|
|
|
mean value: 0.1797405958175659
|
|
|
|
key: score_time
|
|
value: [0.01593828 0.01705146 0.01724339 0.01699996 0.01718593 0.01622796
|
|
0.01678467 0.01644039 0.01647544 0.01549101]
|
|
|
|
mean value: 0.016583847999572753
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 1. 0.96547546 0.96547546 0.96490128 0.96490128
|
|
1. 0.96490128 0.93094934 0.96490128]
|
|
|
|
mean value: 0.9687022616087002
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 1. 0.98245614 0.98245614 0.98214286 0.98214286
|
|
1. 0.98214286 0.96428571 0.98214286]
|
|
|
|
mean value: 0.9840225563909775
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 1. 0.98305085 0.98305085 0.98245614 0.98245614
|
|
1. 0.98245614 0.96551724 0.98245614]
|
|
|
|
mean value: 0.984389963804895
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96551724 1. 0.96666667 0.96666667 0.96551724 0.96551724
|
|
1. 0.96551724 0.93333333 0.96551724]
|
|
|
|
mean value: 0.9694252873563218
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 1. 0.98214286 0.98214286 0.98214286 0.98214286
|
|
1. 0.98214286 0.96428571 0.98214286]
|
|
|
|
mean value: 0.9839901477832513
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 1. 0.96666667 0.96666667 0.96551724 0.96551724
|
|
1. 0.96551724 0.93333333 0.96551724]
|
|
|
|
mean value: 0.9694252873563218
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06405163 0.07544374 0.07800102 0.0659802 0.08432174 0.09123611
|
|
0.07245827 0.06432152 0.07139039 0.0639205 ]
|
|
|
|
mean value: 0.07311251163482665
|
|
|
|
key: score_time
|
|
value: [0.02064395 0.02726412 0.02178264 0.02308631 0.03058195 0.0379889
|
|
0.01936054 0.02460885 0.0287199 0.02765727]
|
|
|
|
mean value: 0.026169443130493165
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.93202124 0.96547546 1. 0.89342711 0.96490128
|
|
0.92857143 0.96490128 0.93094934 0.93094934]
|
|
|
|
mean value: 0.9476713715472729
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99211042 1. 0.99214142 1. 1.
|
|
0.99212598 1. 0.99607071 0.99215674]
|
|
|
|
mean value: 0.996460528395497
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.98245614 1. 0.94642857 0.98214286
|
|
0.96428571 0.98214286 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9733395989974937
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99605523 1. 0.99605523 1. 1.
|
|
0.99606299 1. 0.9980315 0.99606299]
|
|
|
|
mean value: 0.9982267933963875
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.96296296 0.98305085 1. 0.94736842 0.98245614
|
|
0.96428571 0.98245614 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9736070849570189
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99606299 1. 0.99606299 1. 1.
|
|
0.99606299 1. 0.99803536 0.99607843]
|
|
|
|
mean value: 0.9982302771208262
|
|
|
|
key: test_precision
|
|
value: [0.96551724 1. 0.96666667 1. 0.93103448 0.96551724
|
|
0.96428571 0.96551724 0.93333333 0.93333333]
|
|
|
|
mean value: 0.96252052545156
|
|
|
|
key: train_precision
|
|
value: [1. 0.99606299 1. 0.99215686 1. 1.
|
|
0.99606299 1. 0.99607843 0.9921875 ]
|
|
|
|
mean value: 0.9972548778369615
|
|
|
|
key: test_recall
|
|
value: [1. 0.92857143 1. 1. 0.96428571 1.
|
|
0.96428571 1. 1. 1. ]
|
|
|
|
mean value: 0.9857142857142858
|
|
|
|
key: train_recall
|
|
value: [1. 0.99606299 1. 1. 1. 1.
|
|
0.99606299 1. 1. 1. ]
|
|
|
|
mean value: 0.9992125984251968
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.96428571 0.98214286 1. 0.94642857 0.98214286
|
|
0.96428571 0.98214286 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9732758620689655
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99605521 1. 0.99606299 1. 1.
|
|
0.99606299 1. 0.9980315 0.99606299]
|
|
|
|
mean value: 0.9982275683918956
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.92857143 0.96666667 1. 0.9 0.96551724
|
|
0.93103448 0.96551724 0.93333333 0.93333333]
|
|
|
|
mean value: 0.9489490968801314
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99215686 1. 0.99215686 1. 1.
|
|
0.99215686 1. 0.99607843 0.9921875 ]
|
|
|
|
mean value: 0.9964736519607843
|
|
|
|
MCC on Blind test: 0.85
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17205048 0.19767261 0.18131852 0.16802168 0.2220397 0.16332126
|
|
0.26538157 0.2428298 0.2475462 0.21324587]
|
|
|
|
mean value: 0.2073427677154541
|
|
|
|
key: score_time
|
|
value: [0.03067493 0.02527881 0.0278089 0.01557803 0.02717137 0.0271771
|
|
0.02811146 0.03504801 0.04026318 0.02716613]
|
|
|
|
mean value: 0.028427791595458985
|
|
|
|
key: test_mcc
|
|
value: [0.77903565 0.8953202 0.96547546 0.80685836 0.76225171 0.85714286
|
|
0.8660254 0.93094934 0.89342711 0.77459667]
|
|
|
|
mean value: 0.8531082753337808
|
|
|
|
key: train_mcc
|
|
value: [0.98434291 0.98823457 0.98434388 0.98823511 0.99215674 0.99215674
|
|
0.98437404 0.98437404 0.99215674 0.98437404]
|
|
|
|
mean value: 0.9874748809761109
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.94736842 0.98245614 0.89473684 0.875 0.92857143
|
|
0.92857143 0.96428571 0.94642857 0.875 ]
|
|
|
|
mean value: 0.9219611528822055
|
|
|
|
key: train_accuracy
|
|
value: [0.99211045 0.99408284 0.99211045 0.99408284 0.99606299 0.99606299
|
|
0.99212598 0.99212598 0.99606299 0.99212598]
|
|
|
|
mean value: 0.9936953516905062
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.94736842 0.98305085 0.90625 0.8852459 0.92857143
|
|
0.93333333 0.96551724 0.94736842 0.88888889]
|
|
|
|
mean value: 0.9274483372264085
|
|
|
|
key: train_fscore
|
|
value: [0.9921875 0.99412916 0.99215686 0.99410609 0.99607843 0.99607843
|
|
0.9921875 0.9921875 0.99607843 0.9921875 ]
|
|
|
|
mean value: 0.9937377405748746
|
|
|
|
key: test_precision
|
|
value: [0.8 0.93103448 0.96666667 0.82857143 0.81818182 0.92857143
|
|
0.875 0.93333333 0.93103448 0.8 ]
|
|
|
|
mean value: 0.8812393640841917
|
|
|
|
key: train_precision
|
|
value: [0.98449612 0.98832685 0.9844358 0.98828125 0.9921875 0.9921875
|
|
0.98449612 0.98449612 0.9921875 0.98449612]
|
|
|
|
mean value: 0.9875590892038428
|
|
|
|
key: test_recall
|
|
value: [1. 0.96428571 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9821428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.87931034 0.9476601 0.98214286 0.89285714 0.875 0.92857143
|
|
0.92857143 0.96428571 0.94642857 0.875 ]
|
|
|
|
mean value: 0.9219827586206897
|
|
|
|
key: train_roc_auc
|
|
value: [0.99209486 0.99407115 0.99212598 0.99409449 0.99606299 0.99606299
|
|
0.99212598 0.99212598 0.99606299 0.99212598]
|
|
|
|
mean value: 0.9936953409479942
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.9 0.96666667 0.82857143 0.79411765 0.86666667
|
|
0.875 0.93333333 0.9 0.8 ]
|
|
|
|
mean value: 0.8664355742296919
|
|
|
|
key: train_jcc
|
|
value: [0.98449612 0.98832685 0.9844358 0.98828125 0.9921875 0.9921875
|
|
0.98449612 0.98449612 0.9921875 0.98449612]
|
|
|
|
mean value: 0.9875590892038428
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.66994047 0.66400075 0.67747116 0.74977636 0.67632365 0.645926
|
|
0.66423178 0.66457534 0.69733119 0.67097402]
|
|
|
|
mean value: 0.6780550718307495
|
|
|
|
key: score_time
|
|
value: [0.00979495 0.01027083 0.01547742 0.00971317 0.0095067 0.0092721
|
|
0.01001692 0.0094285 0.01009512 0.00964975]
|
|
|
|
mean value: 0.01032254695892334
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 1. 0.96547546 1. 0.93094934 0.96490128
|
|
0.96490128 0.96490128 0.93094934 0.89802651]
|
|
|
|
mean value: 0.958562172459794
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 1. 0.98245614 1. 0.96428571 0.98214286
|
|
0.98214286 0.98214286 0.96428571 0.94642857]
|
|
|
|
mean value: 0.9786340852130325
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 1. 0.98305085 1. 0.96551724 0.98245614
|
|
0.98245614 0.98245614 0.96551724 0.94915254]
|
|
|
|
mean value: 0.9793062433992638
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96551724 1. 0.96666667 1. 0.93333333 0.96551724
|
|
0.96551724 0.96551724 0.93333333 0.90322581]
|
|
|
|
mean value: 0.9598628105302188
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 1. 0.98214286 1. 0.96428571 0.98214286
|
|
0.98214286 0.98214286 0.96428571 0.94642857]
|
|
|
|
mean value: 0.9786330049261084
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 1. 0.96666667 1. 0.93333333 0.96551724
|
|
0.96551724 0.96551724 0.93333333 0.90322581]
|
|
|
|
mean value: 0.9598628105302188
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0364995 0.05236554 0.06891942 0.04245543 0.03577256 0.04559731
|
|
0.04729986 0.04501486 0.05125642 0.03263569]
|
|
|
|
mean value: 0.04578166007995606
|
|
|
|
key: score_time
|
|
value: [0.02079725 0.01780081 0.01469088 0.01432419 0.01472092 0.01861978
|
|
0.02223301 0.02089095 0.01528358 0.01549315]
|
|
|
|
mean value: 0.017485451698303223
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.96547546 1. 0.96547546 0.96490128 0.93094934
|
|
1. 1. 0.96490128 0.96490128]
|
|
|
|
mean value: 0.9688787292752474
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.98245614 1. 0.98245614 0.98214286 0.96428571
|
|
1. 1. 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9840538847117795
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.98181818 1. 0.98305085 0.98181818 0.96296296
|
|
1. 1. 0.98181818 0.98245614]
|
|
|
|
mean value: 0.9839441737605323
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 1. 0.96666667 1. 1.
|
|
1. 1. 1. 0.96551724]
|
|
|
|
mean value: 0.986551724137931
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96428571 1. 1. 0.96428571 0.92857143
|
|
1. 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.9821428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.98214286 1. 0.98214286 0.98214286 0.96428571
|
|
1. 1. 0.98214286 0.98214286]
|
|
|
|
mean value: 0.9840517241379311
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.96428571 1. 0.96666667 0.96428571 0.92857143
|
|
1. 1. 0.96428571 0.96551724]
|
|
|
|
mean value: 0.9686945812807882
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.05
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03992987 0.04090619 0.02985525 0.03153992 0.01640797 0.01648879
|
|
0.02689624 0.02909088 0.0170362 0.01887155]
|
|
|
|
mean value: 0.026702284812927246
|
|
|
|
key: score_time
|
|
value: [0.02908468 0.02943969 0.02950621 0.03354526 0.01250839 0.01258111
|
|
0.02266932 0.02381325 0.01700115 0.0155673 ]
|
|
|
|
mean value: 0.022571635246276856
|
|
|
|
key: test_mcc
|
|
value: [0.89988258 0.8615634 0.82512315 0.93202124 0.82195294 0.96490128
|
|
0.92857143 0.89342711 0.82195294 0.83484711]
|
|
|
|
mean value: 0.8784243193263463
|
|
|
|
key: train_mcc
|
|
value: [0.96450468 0.95667331 0.94872473 0.95661511 0.96853396 0.95278544
|
|
0.9606597 0.96062992 0.95670033 0.95670033]
|
|
|
|
mean value: 0.9582527517536223
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.9122807 0.96491228 0.91071429 0.98214286
|
|
0.96428571 0.94642857 0.91071429 0.91071429]
|
|
|
|
mean value: 0.937938596491228
|
|
|
|
key: train_accuracy
|
|
value: [0.98224852 0.97830375 0.97435897 0.97830375 0.98425197 0.97637795
|
|
0.98031496 0.98031496 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9791167746043579
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.92592593 0.9122807 0.96666667 0.9122807 0.98245614
|
|
0.96428571 0.94736842 0.90909091 0.91803279]
|
|
|
|
mean value: 0.9387540510139624
|
|
|
|
key: train_fscore
|
|
value: [0.98224852 0.97847358 0.97425743 0.97830375 0.98431373 0.97647059
|
|
0.98039216 0.98031496 0.97830375 0.978389 ]
|
|
|
|
mean value: 0.9791467451988494
|
|
|
|
key: test_precision
|
|
value: [0.90322581 0.96153846 0.92857143 0.93548387 0.89655172 0.96551724
|
|
0.96428571 0.93103448 0.92592593 0.84848485]
|
|
|
|
mean value: 0.9260619504501596
|
|
|
|
key: train_precision
|
|
value: [0.98418972 0.97276265 0.97619048 0.97637795 0.98046875 0.97265625
|
|
0.9765625 0.98031496 0.98023715 0.97647059]
|
|
|
|
mean value: 0.9776231001196349
|
|
|
|
key: test_recall
|
|
value: [1. 0.89285714 0.89655172 1. 0.92857143 1.
|
|
0.96428571 0.96428571 0.89285714 1. ]
|
|
|
|
mean value: 0.9539408866995074
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98425197 0.97233202 0.98023715 0.98818898 0.98031496
|
|
0.98425197 0.98031496 0.97637795 0.98031496]
|
|
|
|
mean value: 0.9806899878621892
|
|
|
|
key: test_roc_auc
|
|
value: [0.94827586 0.92918719 0.91256158 0.96428571 0.91071429 0.98214286
|
|
0.96428571 0.94642857 0.91071429 0.91071429]
|
|
|
|
mean value: 0.9379310344827587
|
|
|
|
key: train_roc_auc
|
|
value: [0.98225234 0.97829199 0.97435498 0.97830755 0.98425197 0.97637795
|
|
0.98031496 0.98031496 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9791159627773801
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.86206897 0.83870968 0.93548387 0.83870968 0.96551724
|
|
0.93103448 0.9 0.83333333 0.84848485]
|
|
|
|
mean value: 0.8856567903731418
|
|
|
|
key: train_jcc
|
|
value: [0.96511628 0.95785441 0.94980695 0.95752896 0.96911197 0.95402299
|
|
0.96153846 0.96138996 0.95752896 0.95769231]
|
|
|
|
mean value: 0.9591591238303347
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.45232368 0.31628871 0.2526288 0.39126348 0.50805736 0.3298955
|
|
0.37645435 0.22237539 0.32188296 0.34562302]
|
|
|
|
mean value: 0.35167932510375977
|
|
|
|
key: score_time
|
|
value: [0.01331973 0.02620149 0.01252651 0.02661324 0.02001739 0.02062368
|
|
0.02325225 0.03817391 0.01921773 0.02543306]
|
|
|
|
mean value: 0.022537899017333985
|
|
|
|
key: test_mcc
|
|
value: [0.89988258 0.8615634 0.82512315 0.93202124 0.82195294 0.96490128
|
|
0.92857143 0.89342711 0.82195294 0.83484711]
|
|
|
|
mean value: 0.8784243193263463
|
|
|
|
key: train_mcc
|
|
value: [0.96450468 0.95667331 0.94872473 0.95661511 0.96853396 0.95278544
|
|
0.9606597 0.96062992 0.95670033 0.95670033]
|
|
|
|
mean value: 0.9582527517536223
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.9122807 0.96491228 0.91071429 0.98214286
|
|
0.96428571 0.94642857 0.91071429 0.91071429]
|
|
|
|
mean value: 0.937938596491228
|
|
|
|
key: train_accuracy
|
|
value: [0.98224852 0.97830375 0.97435897 0.97830375 0.98425197 0.97637795
|
|
0.98031496 0.98031496 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9791167746043579
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.92592593 0.9122807 0.96666667 0.9122807 0.98245614
|
|
0.96428571 0.94736842 0.90909091 0.91803279]
|
|
|
|
mean value: 0.9387540510139624
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:188: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_8020.py:191: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.98224852 0.97847358 0.97425743 0.97830375 0.98431373 0.97647059
|
|
0.98039216 0.98031496 0.97830375 0.978389 ]
|
|
|
|
mean value: 0.9791467451988494
|
|
|
|
key: test_precision
|
|
value: [0.90322581 0.96153846 0.92857143 0.93548387 0.89655172 0.96551724
|
|
0.96428571 0.93103448 0.92592593 0.84848485]
|
|
|
|
mean value: 0.9260619504501596
|
|
|
|
key: train_precision
|
|
value: [0.98418972 0.97276265 0.97619048 0.97637795 0.98046875 0.97265625
|
|
0.9765625 0.98031496 0.98023715 0.97647059]
|
|
|
|
mean value: 0.9776231001196349
|
|
|
|
key: test_recall
|
|
value: [1. 0.89285714 0.89655172 1. 0.92857143 1.
|
|
0.96428571 0.96428571 0.89285714 1. ]
|
|
|
|
mean value: 0.9539408866995074
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98425197 0.97233202 0.98023715 0.98818898 0.98031496
|
|
0.98425197 0.98031496 0.97637795 0.98031496]
|
|
|
|
mean value: 0.9806899878621892
|
|
|
|
key: test_roc_auc
|
|
value: [0.94827586 0.92918719 0.91256158 0.96428571 0.91071429 0.98214286
|
|
0.96428571 0.94642857 0.91071429 0.91071429]
|
|
|
|
mean value: 0.9379310344827587
|
|
|
|
key: train_roc_auc
|
|
value: [0.98225234 0.97829199 0.97435498 0.97830755 0.98425197 0.97637795
|
|
0.98031496 0.98031496 0.97834646 0.97834646]
|
|
|
|
mean value: 0.9791159627773801
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.86206897 0.83870968 0.93548387 0.83870968 0.96551724
|
|
0.93103448 0.9 0.83333333 0.84848485]
|
|
|
|
mean value: 0.8856567903731418
|
|
|
|
key: train_jcc
|
|
value: [0.96511628 0.95785441 0.94980695 0.95752896 0.96911197 0.95402299
|
|
0.96153846 0.96138996 0.95752896 0.95769231]
|
|
|
|
mean value: 0.9591591238303347
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.91
|