/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_7030.py:548: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data
Total no. of non-numerial columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 1133
PASS: my_features_df and aa_df successfully combined
nrows: 1133
ncols: 274

count of NULL values before imputation
or_mychisq          339
log10_or_mychisq    339
dtype: int64

count of NULL values AFTER imputation
mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64
PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123
No. of numerical features: 169
No. of categorical features: 7
PASS: x_features has no target variable
No. of columns for x_features: 176
-------------------------------------------------------------
Successfully split data with stratification: 70/30
Input features data size: (557, 176)
Train data size: (373, 176)
Test data size: (184, 176)
y_train numbers: Counter({0: 189, 1: 184})
y_train ratio: 1.0271739130434783
y_test_numbers: Counter({0: 93, 1: 91})
y_test ratio: 1.021978021978022
-------------------------------------------------------------
index: 0
ind: 1
Mask count check: True
index: 1
ind: 2
Mask count check: True
index: 2
ind: 3
Mask count check: True

Original Data
Counter({0: 189, 1: 184})
Data dim: (373, 176)

Simple Random OverSampling
Counter({1: 189, 0: 189})
(378, 176)

Simple Random UnderSampling
Counter({0: 184, 1: 184})
(368, 176)

Simple Combined Over and UnderSampling
Counter({0: 189, 1: 189})
(378, 176)

SMOTE_NC OverSampling
Counter({1: 189, 0: 189})
(378, 176)
#####################################################################
Running ML analysis: 70/30 split
Gene name: rpoB
Drug name: rifampicin
Output directory: /home/tanu/git/Data/rifampicin/output/ml/tts_7030/

Sanity checks:
Total input features: 176
Training data size: (373, 176)
Test data size: (184, 176)
Target feature numbers (training data): Counter({0: 189, 1: 184})
Target features ratio (training data: 1.0271739130434783
Target feature numbers (test data): Counter({0: 93, 1: 91})
Target features ratio (test data): 1.021978021978022
#####################################################################
================================================================
Strucutral features (n): 37
These are:
Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm',
'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss'] Other struc columns: ['rsa', 'kd_values', 'rd_values'] ================================================================ AAindex features (n): 123 These are: ['ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 
'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'] ================================================================ Evolutionary features (n): 3 These are: ['consurf_score', 'snap2_score', 'provean_score'] ================================================================ Genomic features (n): 6 These are: ['maf', 'logorI'] ['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'] ================================================================ Categorical features (n): 7 These are: ['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'] ================================================================ Pass: No. of features match ##################################################################### Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)),
('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.06178975 0.03049135 0.03225613 0.03168249 0.03428817 0.03472066 0.0346334 0.02963018 0.0661819 0.06939864]
mean value: 0.042507266998291014

key: score_time
value: [0.02421689 0.01209807 0.01218534 0.01204538 0.01496863 0.01508284 0.01212668 0.01215744 0.01237869 0.01236558]
mean value: 0.013962554931640624

key: test_mcc
value: [0.89973541 0.57894737 0.68803296 0.73099415 0.83918129 0.68035483 0.83918129 0.89181287 0.94736842 0.84834956]
mean value: 0.7943958140147007

key: train_mcc
value: [0.86265911 0.8687128 0.88086411 0.87498893 0.87500665 0.88101481 0.87500665 0.87500665 0.86910921 0.86324256]
mean value: 0.8725611472917782

key: test_accuracy
value: [0.94736842 0.78947368 0.84210526 0.86486486 0.91891892 0.83783784 0.91891892 0.94594595 0.97297297 0.91891892]
mean value: 0.8957325746799432

key: train_accuracy
value: [0.93134328 0.93432836 0.94029851 0.9375 0.9375 0.94047619 0.9375 0.9375 0.93452381 0.93154762]
mean value: 0.936251776830135

key: test_fscore
value: [0.95 0.78947368 0.83333333 0.86486486 0.91891892 0.84210526 0.91891892 0.94444444 0.97297297 0.90909091]
mean value: 0.8944123309912784

key: train_fscore
value: [0.93009119 0.93373494 0.94011976 0.93655589 0.93693694 0.94011976 0.93693694 0.93693694 0.93413174 0.93134328]
mean value: 0.9356907368285973

key: test_precision
value: [0.9047619 0.78947368 0.88235294 0.88888889 0.89473684 0.8 0.89473684 0.94444444 0.94736842 1. ]
mean value: 0.8946763968745393

key: train_precision
value: [0.93292683 0.92814371 0.92899408 0.93373494 0.93413174 0.93452381 0.93413174 0.93413174 0.92857143 0.92307692]
mean value: 0.9312366935195415

key: test_recall
value: [1. 0.78947368 0.78947368 0.84210526 0.94444444 0.88888889 0.94444444 0.94444444 1. 0.83333333]
mean value: 0.8976608187134503

key: train_recall
value: [0.92727273 0.93939394 0.95151515 0.93939394 0.93975904 0.94578313 0.93975904 0.93975904 0.93975904 0.93975904]
mean value: 0.940215407082877

key: test_roc_auc
value: [0.94736842 0.78947368 0.84210526 0.86549708 0.91959064 0.83918129 0.91959064 0.94590643 0.97368421 0.91666667]
mean value: 0.895906432748538

key: train_roc_auc
value: [0.93128342 0.93440285 0.94046346 0.93753323 0.93752658 0.94053863 0.93752658 0.93752658 0.9345854 0.93164422]
mean value: 0.936303093978315

key: test_jcc
value: [0.9047619 0.65217391 0.71428571 0.76190476 0.85 0.72727273 0.85 0.89473684 0.94736842 0.83333333]
mean value: 0.8135837617759815

key: train_jcc
value: [0.86931818 0.87570621 0.88700565 0.88068182 0.88135593 0.88700565 0.88135593 0.88135593 0.87640449 0.87150838]
mean value: 0.8791698185004754

MCC on Blind test: 0.84
Accuracy on Blind test: 0.92

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [1.24725127 1.25279379 1.20516229 1.37949181 1.13146496 1.56774902 1.27143836 1.19698691 1.58842921 1.32122326] mean value: 1.3161990880966186 key: score_time value: [0.0150485 0.01297903 0.01233244 0.0124383 0.01545596 0.01286793 0.01544952 0.01620698 0.01299453 0.0154202 ] mean value: 0.014119338989257813 key: test_mcc value: [0.89973541 0.68803296 0.68803296 0.68035483 0.83918129 0.68035483 0.83918129 0.78764146 0.94736842 0.73020842] mean value: 0.7780091871199141 key: train_mcc value: [0.88065448 1. 0.83879937 0.90473153 0.89284196 0.83334517 0.88691246 1. 1. 0.98809355] mean value: 0.9225378515989221 key: test_accuracy value: [0.94736842 0.84210526 0.84210526 0.83783784 0.91891892 0.83783784 0.91891892 0.89189189 0.97297297 0.86486486] mean value: 0.8874822190611664 key: train_accuracy value: [0.94029851 1. 0.91940299 0.95238095 0.94642857 0.91666667 0.94345238 1. 1. 
0.99404762] mean value: 0.9612677683013504 key: test_fscore value: [0.95 0.85 0.83333333 0.83333333 0.91891892 0.84210526 0.91891892 0.88235294 0.97297297 0.85714286] mean value: 0.88590785389547 key: train_fscore value: [0.93975904 1. 0.918429 0.95151515 0.94578313 0.91515152 0.94294294 1. 1. 0.9939759 ] mean value: 0.9607556684919915 key: test_precision value: [0.9047619 0.80952381 0.88235294 0.88235294 0.89473684 0.8 0.89473684 0.9375 0.94736842 0.88235294] mean value: 0.8835686643078284 key: train_precision value: [0.93413174 1. 0.91566265 0.95151515 0.94578313 0.92073171 0.94011976 1. 1. 0.9939759 ] mean value: 0.96019200425852 key: test_recall value: [1. 0.89473684 0.78947368 0.78947368 0.94444444 0.88888889 0.94444444 0.83333333 1. 0.83333333] mean value: 0.8918128654970761 key: train_recall value: [0.94545455 1. 0.92121212 0.95151515 0.94578313 0.90963855 0.94578313 1. 1. 0.9939759 ] mean value: 0.9613362541073385 key: test_roc_auc value: [0.94736842 0.84210526 0.84210526 0.83918129 0.91959064 0.83918129 0.91959064 0.89035088 0.97368421 0.86403509] mean value: 0.887719298245614 key: train_roc_auc value: [0.94037433 1. 0.91942959 0.95236576 0.94642098 0.91658398 0.9434798 1. 1. 0.99404678] mean value: 0.9612701222377078 key: test_jcc value: [0.9047619 0.73913043 0.71428571 0.71428571 0.85 0.72727273 0.85 0.78947368 0.94736842 0.75 ] mean value: 0.7986578600651827 key: train_jcc value: [0.88636364 1. 0.84916201 0.90751445 0.89714286 0.84357542 0.89204545 1. 1. 
0.98802395] mean value: 0.9263827781182407 MCC on Blind test: 0.76 Accuracy on Blind test: 0.88 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01527596 0.01070619 0.01155663 0.01507545 0.0173049 0.00961065 0.00960159 0.01027203 0.01683927 0.00966716] mean value: 0.012590980529785157 key: score_time value: [0.01228333 0.00963926 0.01033974 0.01431751 0.01128507 0.00904703 0.00902438 0.01120901 0.01366067 0.00890756] mean value: 0.010971355438232421 key: test_mcc value: [0.74620251 0.42640143 0.61017022 0.47328975 0.57857577 0.53638795 0.83918129 0.73821295 0.73821295 0.51319869] mean value: 0.6199833494826468 key: train_mcc value: [0.63213973 0.67077671 0.66818514 0.67469654 0.63279874 0.65000993 0.65436967 0.65987564 0.64691443 0.63336739] mean value: 0.6523133932014681 key: test_accuracy value: [0.86842105 0.71052632 0.78947368 0.72972973 0.78378378 0.75675676 0.91891892 0.86486486 0.86486486 0.75675676] mean value: 0.8044096728307255 key: train_accuracy value: [0.81492537 0.83283582 0.83283582 0.83630952 0.80952381 0.82142857 0.82440476 0.82738095 0.82142857 0.81547619] mean value: 0.8236549395877754 key: test_fscore value: [0.85714286 0.68571429 0.75 0.70588235 0.75 0.7804878 0.91891892 0.84848485 
0.84848485 0.74285714] mean value: 0.7887973059422126 key: train_fscore value: [0.80254777 0.81818182 0.82165605 0.82539683 0.78378378 0.80392157 0.80906149 0.81290323 0.80769231 0.80379747] mean value: 0.8088942308172258 key: test_precision value: [0.9375 0.75 0.92307692 0.8 0.85714286 0.69565217 0.89473684 0.93333333 0.93333333 0.76470588] mean value: 0.8489481345257694 key: train_precision value: [0.84563758 0.88111888 0.86577181 0.86666667 0.89230769 0.87857143 0.87412587 0.875 0.8630137 0.84666667] mean value: 0.8688880304060501 key: test_recall value: [0.78947368 0.63157895 0.63157895 0.63157895 0.66666667 0.88888889 0.94444444 0.77777778 0.77777778 0.72222222] mean value: 0.7461988304093568 key: train_recall value: [0.76363636 0.76363636 0.78181818 0.78787879 0.69879518 0.74096386 0.75301205 0.75903614 0.75903614 0.76506024] mean value: 0.7572873311427528 key: test_roc_auc value: [0.86842105 0.71052632 0.78947368 0.73245614 0.78070175 0.76023392 0.91959064 0.8625731 0.8625731 0.75584795] mean value: 0.8042397660818713 key: train_roc_auc value: [0.81417112 0.83181818 0.83208556 0.83545986 0.80822112 0.82048193 0.82356485 0.8265769 0.82069454 0.81488306] mean value: 0.8227957123550022 key: test_jcc value: [0.75 0.52173913 0.6 0.54545455 0.6 0.64 0.85 0.73684211 0.73684211 0.59090909] mean value: 0.6571786977324735 key: train_jcc value: [0.67021277 0.69230769 0.6972973 0.7027027 0.64444444 0.67213115 0.67934783 0.68478261 0.67741935 0.67195767] mean value: 0.6792603511829558 MCC on Blind test: 0.7 Accuracy on Blind test: 0.85 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01008868 0.0128963 0.02115774 0.00986743 0.00986052 0.01429033 0.01309276 0.01284504 0.00995731 0.01076698] mean value: 0.012482309341430664 key: score_time value: [0.00907111 0.01345086 0.00931144 0.00973558 0.00988913 0.0149653 0.01284695 0.00896931 0.0095892 0.0089035 ] mean value: 0.010673236846923829 key: test_mcc value: [0.74620251 0.42640143 0.65465367 0.56725146 0.68035483 0.56725146 0.56725146 0.67849265 0.7888597 0.83918129] mean value: 0.6515900457032924 key: train_mcc value: [0.73755882 0.7792393 0.74367201 0.73209888 0.70905196 0.7441844 0.76402212 0.73287373 0.70870914 0.71482244] mean value: 0.7366232811135077 key: test_accuracy value: [0.86842105 0.71052632 0.81578947 0.78378378 0.83783784 0.78378378 0.78378378 0.83783784 0.89189189 0.91891892] mean value: 0.8232574679943101 key: train_accuracy value: [0.86865672 0.88955224 0.87164179 0.86607143 0.85416667 0.87202381 0.88095238 0.86607143 0.85416667 0.85714286] mean value: 0.8680445984363895 key: test_fscore value: [0.87804878 0.73170732 0.78787879 0.78947368 0.84210526 0.77777778 0.77777778 0.82352941 0.89473684 0.91891892] mean value: 0.8221954561152628 key: train_fscore value: [0.86826347 0.88888889 0.87164179 0.86404834 0.85545723 0.87164179 0.88372093 0.86725664 0.85459941 0.85798817] mean value: 0.868350664914892 key: test_precision value: [0.81818182 0.68181818 0.92857143 0.78947368 0.8 0.77777778 0.77777778 0.875 0.85 0.89473684] mean value: 0.8193337510442774 key: train_precision value: [0.85798817 0.88095238 0.85882353 0.86144578 0.83815029 0.86390533 0.85393258 0.84971098 0.84210526 
0.84302326] mean value: 0.8550037559538748 key: test_recall value: [0.94736842 0.78947368 0.68421053 0.78947368 0.88888889 0.77777778 0.77777778 0.77777778 0.94444444 0.94444444] mean value: 0.8321637426900584 key: train_recall value: [0.87878788 0.8969697 0.88484848 0.86666667 0.87349398 0.87951807 0.91566265 0.88554217 0.86746988 0.87349398] mean value: 0.8822453450164294 key: test_roc_auc value: [0.86842105 0.71052632 0.81578947 0.78362573 0.83918129 0.78362573 0.78362573 0.83625731 0.89327485 0.91959064] mean value: 0.8233918128654971 key: train_roc_auc value: [0.8688057 0.88966132 0.87183601 0.86608187 0.85439405 0.87211198 0.88136074 0.8663005 0.85432318 0.85733522] mean value: 0.8682210557211489 key: test_jcc value: [0.7826087 0.57692308 0.65 0.65217391 0.72727273 0.63636364 0.63636364 0.7 0.80952381 0.85 ] mean value: 0.7021229495142538 key: train_jcc value: [0.76719577 0.8 0.77248677 0.7606383 0.74742268 0.77248677 0.79166667 0.765625 0.74611399 0.75129534] mean value: 0.7674931283545561 MCC on Blind test: 0.73 Accuracy on Blind test: 0.86 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00998735 0.01429224 0.01066971 0.01056051 0.01671553 0.00953698 0.01070189 0.0094161 0.0092299 0.01051044] mean value: 0.011162066459655761 key: score_time value: [0.06381631 0.04476571 0.01716495 0.01765943 0.0184772 0.0145154 0.01202798 0.0113976 0.01206398 0.01199079] mean value: 0.022387933731079102 key: test_mcc value: [0.42163702 0.15877684 0.63960215 0.24633537 0.7888597 0.08554907 0.4163404 0.30307132 0.45906433 0.62170355] mean value: 0.41409397538354425 key: train_mcc value: [0.64781471 0.70758921 0.65960709 0.67249172 0.63089248 0.68489413 0.65480084 0.69054046 0.66065385 0.65480084] mean value: 0.6664085336282927 key: test_accuracy value: [0.71052632 0.57894737 0.81578947 0.62162162 0.89189189 0.54054054 0.7027027 0.64864865 0.72972973 0.81081081] mean value: 0.7051209103840683 key: train_accuracy value: [0.8238806 0.85373134 0.82985075 0.83630952 0.81547619 0.8422619 0.82738095 0.8452381 0.83035714 0.82738095] mean value: 0.8331867448471926 key: test_fscore value: [0.7027027 0.6 0.8 0.61111111 0.89473684 0.56410256 0.64516129 0.58064516 0.72222222 0.8 ] mean value: 0.6920681893856767 key: train_fscore value: [0.81846154 0.85285285 0.82674772 0.83282675 0.81212121 0.84272997 0.82317073 0.84146341 0.82779456 0.82317073] mean value: 0.8301339481829435 key: test_precision value: [0.72222222 0.57142857 0.875 0.64705882 0.85 0.52380952 0.76923077 0.69230769 0.72222222 0.82352941] mean value: 0.7196809236515119 key: train_precision value: [0.83125 0.8452381 0.82926829 0.83536585 0.81707317 0.83040936 0.83333333 0.85185185 0.83030303 0.83333333] mean 
value: 0.8337426317857961 key: test_recall value: [0.68421053 0.63157895 0.73684211 0.57894737 0.94444444 0.61111111 0.55555556 0.5 0.72222222 0.77777778] mean value: 0.6742690058479532 key: train_recall value: [0.80606061 0.86060606 0.82424242 0.83030303 0.80722892 0.85542169 0.81325301 0.8313253 0.8253012 0.81325301] mean value: 0.8266995253742242 key: test_roc_auc value: [0.71052632 0.57894737 0.81578947 0.62280702 0.89327485 0.54239766 0.69883041 0.64473684 0.72953216 0.80994152] mean value: 0.7046783625730995 key: train_roc_auc value: [0.82361854 0.85383244 0.82976827 0.83620415 0.81537916 0.84241673 0.82721474 0.84507442 0.83029766 0.82721474] mean value: 0.8331020846685362 key: test_jcc value: [0.54166667 0.42857143 0.66666667 0.44 0.80952381 0.39285714 0.47619048 0.40909091 0.56521739 0.66666667] mean value: 0.5396451157538114 key: train_jcc value: [0.69270833 0.7434555 0.70466321 0.71354167 0.68367347 0.72820513 0.69948187 0.72631579 0.70618557 0.69948187] mean value: 0.7097712394464257 MCC on Blind test: 0.49 Accuracy on Blind test: 0.74 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01723957 0.01586294 0.01570201 0.01627302 0.01574993 0.0172441 0.01658297 0.01831293 0.01824284 0.01881599] mean value: 0.017002630233764648 key: score_time value: [0.0108459 0.01086402 0.010607 0.01053047 0.01058245 0.01136756 0.01049376 0.01160789 0.01160383 0.01154995] mean value: 0.011005282402038574 key: test_mcc value: [0.89473684 0.47633051 0.63960215 0.7888597 0.84959079 0.68035483 0.78362573 0.83871328 1. 0.83871328] mean value: 0.7790527120618065 key: train_mcc value: [0.79118098 0.83279857 0.81532977 0.80977356 0.78568391 0.80949681 0.80371348 0.8097803 0.79787385 0.77976011] mean value: 0.8035391354542314 key: test_accuracy value: [0.94736842 0.73684211 0.81578947 0.89189189 0.91891892 0.83783784 0.89189189 0.91891892 1. 0.91891892] mean value: 0.8878378378378379 key: train_accuracy value: [0.89552239 0.91641791 0.90746269 0.9047619 0.89285714 0.9047619 0.90178571 0.9047619 0.89880952 0.88988095] mean value: 0.9017022032693675 key: test_fscore value: [0.94736842 0.75 0.8 0.88888889 0.92307692 0.84210526 0.88888889 0.91428571 1. 0.91428571] mean value: 0.8868899813636656 key: train_fscore value: [0.89489489 0.91515152 0.90746269 0.90419162 0.89156627 0.90361446 0.90149254 0.9047619 0.89880952 0.88888889] mean value: 0.9010834291045358 key: test_precision value: [0.94736842 0.71428571 0.875 0.94117647 0.85714286 0.8 0.88888889 0.94117647 1. 
0.94117647] mean value: 0.8906215293134798 key: train_precision value: [0.88690476 0.91515152 0.89411765 0.89349112 0.89156627 0.90361446 0.89349112 0.89411765 0.88823529 0.88622754] mean value: 0.8946917381614027 key: test_recall value: [0.94736842 0.78947368 0.73684211 0.84210526 1. 0.88888889 0.88888889 0.88888889 1. 0.88888889] mean value: 0.8871345029239766 key: train_recall value: [0.9030303 0.91515152 0.92121212 0.91515152 0.89156627 0.90361446 0.90963855 0.91566265 0.90963855 0.89156627] mean value: 0.9076232201533406 key: test_roc_auc value: [0.94736842 0.73684211 0.81578947 0.89327485 0.92105263 0.83918129 0.89181287 0.91812865 1. 0.91812865] mean value: 0.8881578947368421 key: train_roc_auc value: [0.8956328 0.91639929 0.90766488 0.90494418 0.89284196 0.90474841 0.9018781 0.90489015 0.89893692 0.88990078] mean value: 0.9017837462995806 key: test_jcc value: [0.9 0.6 0.66666667 0.8 0.85714286 0.72727273 0.8 0.84210526 1. 0.84210526] mean value: 0.8035292777398041 key: train_jcc value: [0.80978261 0.84357542 0.83060109 0.82513661 0.80434783 0.82417582 0.82065217 0.82608696 0.81621622 0.8 ] mean value: 0.8200574729521878 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [2.2372601 2.04340816 1.6573236 1.768543 1.91152811 1.71026349 1.82718539 2.05708623 2.5108068 2.36679077] mean value: 2.0090195655822756 key: score_time value: [0.01652408 0.01711798 0.01492739 0.02228522 0.02000928 0.01484942 0.02607751 0.01266646 0.02013946 0.01371574] mean value: 0.017831254005432128 key: test_mcc value: [0.89973541 0.58218174 0.63960215 0.74044197 0.7888597 0.6754386 0.80369958 0.78362573 0.94736842 0.73020842] mean value: 0.7591161713668702 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.78947368 0.81578947 0.86486486 0.89189189 0.83783784 0.89189189 0.89189189 0.97297297 0.86486486] mean value: 0.8768847795163585 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94444444 0.8 0.8 0.85714286 0.89473684 0.83333333 0.9 0.88888889 0.97297297 0.85714286] mean value: 0.8748662196030617 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.76190476 0.875 0.9375 0.85 0.83333333 0.81818182 0.88888889 0.94736842 0.88235294] mean value: 0.8794530164537905 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.89473684 0.84210526 0.73684211 0.78947368 0.94444444 0.83333333 1. 0.88888889 1. 0.83333333] mean value: 0.8763157894736842 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.78947368 0.81578947 0.86695906 0.89327485 0.8377193 0.89473684 0.89181287 0.97368421 0.86403509] mean value: 0.8774853801169591 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.89473684 0.66666667 0.66666667 0.75 0.80952381 0.71428571 0.81818182 0.8 0.94736842 0.75 ] mean value: 0.781742993848257 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, 
num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02188754 0.01587939 0.01625609 0.01523471 0.01606822 0.01897764 0.02061105 0.01994038 0.01674199 0.01598382] mean value: 0.01775808334350586 key: score_time value: [0.01280594 0.00951886 0.00900126 0.00924301 0.01010299 0.01299381 0.01148295 0.0133841 0.0088644 0.00901771] mean value: 0.01064150333404541 key: test_mcc value: [0.9486833 0.79388419 0.89973541 0.83918129 0.7888597 0.83918129 0.94736842 0.89181287 1. 0.83918129] mean value: 0.8787887738162246 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.89473684 0.94736842 0.91891892 0.89189189 0.91891892 0.97297297 0.94594595 1. 0.91891892] mean value: 0.9383357041251779 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.88888889 0.94444444 0.91891892 0.89473684 0.91891892 0.97297297 0.94444444 1. 0.91891892] mean value: 0.9376603323971745 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94117647 1. 0.94444444 0.85 0.89473684 0.94736842 0.94444444 1. 0.89473684] mean value: 0.9366907464740282 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.84210526 0.89473684 0.89473684 0.94444444 0.94444444 1. 0.94444444 1. 0.94444444] mean value: 0.9409356725146198 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.89473684 0.94736842 0.91959064 0.89327485 0.91959064 0.97368421 0.94590643 1. 
0.91959064] mean value: 0.9387426900584795 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.8 0.89473684 0.85 0.80952381 0.85 0.94736842 0.89473684 1. 0.85 ] mean value: 0.8846365914786968 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.89 Accuracy on Blind test: 0.95 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', 
SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.10830688 0.10529113 0.10499811 0.10781646 0.10907078 0.10594296 0.10623217 0.10774302 0.12120008 0.11080527] mean value: 0.10874068737030029 key: score_time value: [0.01754594 0.01745701 0.01749945 0.01769519 0.01761103 0.01773334 0.01761365 0.01932955 0.01826262 0.0196104 ] mean value: 0.01803581714630127 key: test_mcc value: [1. 0.52704628 0.63960215 0.7888597 0.7888597 0.56725146 0.6754386 0.83918129 1. 0.84834956] mean value: 0.7674588721170138 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.76315789 0.81578947 0.89189189 0.89189189 0.78378378 0.83783784 0.91891892 1. 0.91891892] mean value: 0.8822190611664296 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 
0.76923077 0.8 0.88888889 0.89473684 0.77777778 0.83333333 0.91891892 1. 0.90909091] mean value: 0.879197743934586 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.75 0.875 0.94117647 0.85 0.77777778 0.83333333 0.89473684 1. 1. ] mean value: 0.892202442380461 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.78947368 0.73684211 0.84210526 0.94444444 0.77777778 0.83333333 0.94444444 1. 0.83333333] mean value: 0.8701754385964913 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.76315789 0.81578947 0.89327485 0.89327485 0.78362573 0.8377193 0.91959064 1. 0.91666667] mean value: 0.8823099415204678 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.625 0.66666667 0.8 0.80952381 0.63636364 0.71428571 0.85 1. 0.83333333] mean value: 0.793517316017316 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.82 Accuracy on Blind test: 0.91 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01310754 0.01101828 0.01016736 0.01003575 0.01700163 0.01136065 0.01075435 0.01067233 0.01063561 0.0164597 ] mean value: 0.012121319770812988 key: score_time value: [0.00934529 0.00985003 0.00910783 0.01338744 0.01081181 0.0099082 0.00986314 0.00971031 0.01279759 0.01062942] mean value: 0.010541105270385742 key: test_mcc value: [0.73786479 0.32732684 0.42163702 0.56725146 0.41299552 0.24269006 0.35087719 0.73099415 0.35104619 0.52214434] mean value: 0.466482756739817 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.86842105 0.65789474 0.71052632 0.78378378 0.7027027 0.62162162 0.67567568 0.86486486 0.67567568 0.75675676] mean value: 0.731792318634424 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.87179487 0.69767442 0.71794872 0.78947368 0.71794872 0.61111111 0.66666667 0.86486486 0.64705882 0.76923077] mean value: 0.7353772645910309 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.85 0.625 0.7 0.78947368 0.66666667 0.61111111 0.66666667 0.84210526 0.6875 0.71428571] mean value: 0.715280910609858 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.89473684 0.78947368 0.73684211 0.78947368 0.77777778 0.61111111 0.66666667 0.88888889 0.61111111 0.83333333] mean value: 0.7599415204678363 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.86842105 0.65789474 0.71052632 0.78362573 0.70467836 0.62134503 0.6754386 0.86549708 0.67397661 0.75877193] mean value: 0.7320175438596491 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.77272727 0.53571429 0.56 0.65217391 0.56 0.44 0.5 0.76190476 0.47826087 0.625 ] mean value: 0.5885781102955016 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.37 Accuracy on Blind test: 0.68 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, 
gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.74289036 1.53044105 1.49827647 1.53210521 1.52824092 1.50557423 1.51571012 1.51636791 1.51605439 1.5078156 ] mean value: 1.5393476247787476 key: score_time value: [0.09816289 0.09600377 0.09313393 0.09704614 0.09628916 0.09311295 0.09352589 0.09830904 0.09392619 0.09237742] mean value: 0.09518873691558838 key: test_mcc value: [0.9486833 0.79388419 0.80757285 0.94736842 0.89736456 0.89181287 0.94736842 0.83918129 1. 1. ] mean value: 0.9073235893751431 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.89473684 0.89473684 0.97297297 0.94594595 0.94594595 0.97297297 0.91891892 1. 1. ] mean value: 0.95199146514936 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.88888889 0.88235294 0.97297297 0.94736842 0.94444444 0.97297297 0.91891892 1. 1. ] mean value: 0.9502278534786275 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94117647 1. 1. 0.9 0.94444444 0.94736842 0.89473684 1. 1. ] mean value: 0.9577726178190574 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.84210526 0.78947368 0.94736842 1. 0.94444444 1. 0.94444444 1. 1. ] mean value: 0.9467836257309942 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.89473684 0.89473684 0.97368421 0.94736842 0.94590643 0.97368421 0.91959064 1. 1. 
] mean value: 0.9523391812865497 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.8 0.78947368 0.94736842 0.9 0.89473684 0.94736842 0.85 1. 1. ] mean value: 0.9078947368421053 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.89 Accuracy on Blind test: 0.95 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', 
PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. 
To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( key: fit_time value: [1.79304814 0.91387272 1.02754641 0.95623207 0.91034627 0.89128447 0.93726492 0.88634753 0.91595483 0.94292164] mean value: 1.0174818992614747 key: score_time value: [0.26749158 0.23623896 0.25596142 0.2053082 0.22877908 0.22374535 0.21946526 0.25390697 0.22047591 0.22399354] mean value: 0.23353662490844726 key: test_mcc value: [1. 0.68803296 0.76376262 0.84959079 0.89736456 0.7888597 0.89181287 0.83918129 1. 0.94721815] mean value: 0.8665822928273674 key: train_mcc value: [0.94639427 0.97016256 0.96423353 0.96427432 0.96434396 0.95243498 0.95243498 0.96428065 0.94656062 0.95834146] mean value: 0.9583461336790984 key: test_accuracy value: [1. 
0.84210526 0.86842105 0.91891892 0.94594595 0.89189189 0.94594595 0.91891892 1. 0.97297297] mean value: 0.9305120910384068 key: train_accuracy value: [0.97313433 0.98507463 0.98208955 0.98214286 0.98214286 0.97619048 0.97619048 0.98214286 0.97321429 0.97916667] mean value: 0.9791488983653163 key: test_fscore value: [1. 0.83333333 0.84848485 0.91428571 0.94736842 0.89473684 0.94444444 0.91891892 1. 0.97142857] mean value: 0.9273001094053726 key: train_fscore value: [0.97247706 0.98489426 0.98170732 0.98181818 0.98181818 0.97575758 0.97575758 0.98192771 0.97264438 0.97885196] mean value: 0.9787654207752894 key: test_precision value: [1. 0.88235294 1. 1. 0.9 0.85 0.94444444 0.89473684 1. 1. ] mean value: 0.9471534227726178 key: train_precision value: [0.98148148 0.98192771 0.98773006 0.98181818 0.98780488 0.98170732 0.98170732 0.98192771 0.98159509 0.98181818] mean value: 0.9829517932373947 key: test_recall value: [1. 0.78947368 0.73684211 0.84210526 1. 0.94444444 0.94444444 0.94444444 1. 0.94444444] mean value: 0.9146198830409357 key: train_recall value: [0.96363636 0.98787879 0.97575758 0.98181818 0.97590361 0.96987952 0.96987952 0.98192771 0.96385542 0.97590361] mean value: 0.974644030668127 key: test_roc_auc value: [1. 0.84210526 0.86842105 0.92105263 0.94736842 0.89327485 0.94590643 0.91959064 1. 0.97222222] mean value: 0.9309941520467836 key: train_roc_auc value: [0.97299465 0.98511586 0.98199643 0.98213716 0.98206945 0.97611623 0.97611623 0.98214033 0.97310418 0.97912828] mean value: 0.9790918811751368 key: test_jcc value: [1. 0.71428571 0.73684211 0.84210526 0.9 0.80952381 0.89473684 0.85 1. 
0.94444444] mean value: 0.8691938178780284 key: train_jcc value: [0.94642857 0.9702381 0.96407186 0.96428571 0.96428571 0.95266272 0.95266272 0.96449704 0.94674556 0.95857988] mean value: 0.9584457880519603 MCC on Blind test: 0.85 Accuracy on Blind test: 0.92 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02368402 0.00988269 0.00996494 0.01014757 0.00983524 0.0103724 0.00982785 0.00976944 0.010041 0.01028967] mean value: 0.01138148307800293 key: score_time value: [0.01074362 0.00889611 0.06786227 0.00892687 0.00902653 0.00950575 0.00921893 0.00906777 0.00902295 0.00916648] mean value: 0.015143728256225586 key: test_mcc value: [0.74620251 0.42640143 0.65465367 0.56725146 0.68035483 0.56725146 0.56725146 0.67849265 0.7888597 0.83918129] mean value: 0.6515900457032924 key: train_mcc value: [0.73755882 0.7792393 0.74367201 0.73209888 0.70905196 0.7441844 0.76402212 0.73287373 0.70870914 0.71482244] mean value: 0.7366232811135077 key: test_accuracy value: [0.86842105 0.71052632 0.81578947 0.78378378 0.83783784 0.78378378 0.78378378 0.83783784 0.89189189 0.91891892] mean value: 0.8232574679943101 key: train_accuracy value: [0.86865672 0.88955224 0.87164179 0.86607143 0.85416667 0.87202381 0.88095238 0.86607143 0.85416667 
0.85714286] mean value: 0.8680445984363895 key: test_fscore value: [0.87804878 0.73170732 0.78787879 0.78947368 0.84210526 0.77777778 0.77777778 0.82352941 0.89473684 0.91891892] mean value: 0.8221954561152628 key: train_fscore value: [0.86826347 0.88888889 0.87164179 0.86404834 0.85545723 0.87164179 0.88372093 0.86725664 0.85459941 0.85798817] mean value: 0.868350664914892 key: test_precision value: [0.81818182 0.68181818 0.92857143 0.78947368 0.8 0.77777778 0.77777778 0.875 0.85 0.89473684] mean value: 0.8193337510442774 key: train_precision value: [0.85798817 0.88095238 0.85882353 0.86144578 0.83815029 0.86390533 0.85393258 0.84971098 0.84210526 0.84302326] mean value: 0.8550037559538748 key: test_recall value: [0.94736842 0.78947368 0.68421053 0.78947368 0.88888889 0.77777778 0.77777778 0.77777778 0.94444444 0.94444444] mean value: 0.8321637426900584 key: train_recall value: [0.87878788 0.8969697 0.88484848 0.86666667 0.87349398 0.87951807 0.91566265 0.88554217 0.86746988 0.87349398] mean value: 0.8822453450164294 key: test_roc_auc value: [0.86842105 0.71052632 0.81578947 0.78362573 0.83918129 0.78362573 0.78362573 0.83625731 0.89327485 0.91959064] mean value: 0.8233918128654971 key: train_roc_auc value: [0.8688057 0.88966132 0.87183601 0.86608187 0.85439405 0.87211198 0.88136074 0.8663005 0.85432318 0.85733522] mean value: 0.8682210557211489 key: test_jcc value: [0.7826087 0.57692308 0.65 0.65217391 0.72727273 0.63636364 0.63636364 0.7 0.80952381 0.85 ] mean value: 0.7021229495142538 key: train_jcc value: [0.76719577 0.8 0.77248677 0.7606383 0.74742268 0.77248677 0.79166667 0.765625 0.74611399 0.75129534] mean value: 0.7674931283545561 MCC on Blind test: 0.73 Accuracy on Blind test: 0.86 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, 
max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), 
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.24965024 0.05168438 0.06280875 0.06240273 0.07617354 0.05706716 0.05821943 0.06454206 0.06267881 0.09799004] mean value: 0.0843217134475708 key: score_time value: [0.01142573 0.01089931 0.01093245 0.01113415 0.01153183 0.01123571 0.01077247 0.01092672 0.01053596 0.01156497] mean value: 0.011095929145812988 key: test_mcc value: [0.9486833 0.84327404 0.89973541 0.83918129 0.94736842 0.94736842 0.94736842 0.89181287 1. 0.94736842] mean value: 0.9212160587861828 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.92105263 0.94736842 0.91891892 0.97297297 0.97297297 0.97297297 0.94594595 1. 0.97297297] mean value: 0.9598862019914651 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_fscore value: [0.97435897 0.91891892 0.94444444 0.91891892 0.97297297 0.97297297 0.97297297 0.94444444 1. 0.97297297] mean value: 0.9592977592977593 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94444444 1. 0.94444444 0.94736842 0.94736842 0.94736842 0.94444444 1. 0.94736842] mean value: 0.957280701754386 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.89473684 0.89473684 0.89473684 1. 1. 1. 0.94444444 1. 1. ] mean value: 0.9628654970760234 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.92105263 0.94736842 0.91959064 0.97368421 0.97368421 0.97368421 0.94590643 1. 0.97368421] mean value: 0.960233918128655 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.85 0.89473684 0.85 0.94736842 0.94736842 0.94736842 0.89473684 1. 0.94736842] mean value: 0.9228947368421052 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.03995848 0.03733993 0.03714538 0.03793836 0.08239031 0.03703666 0.03627944 0.07113647 0.07073593 0.04976606] mean value: 0.049972701072692874 key: score_time value: [0.01242304 0.01249933 0.01246953 0.01262975 0.01240516 0.01247954 0.01251364 0.02486706 0.02162528 0.01249456] mean value: 0.0146406888961792 key: test_mcc value: [0.89473684 0.53300179 0.59222009 0.56725146 0.89736456 0.68035483 0.94736842 0.89679028 0.84834956 0.62170355] mean value: 0.7479141390576655 key: train_mcc value: [0.95248307 0.95222816 0.95822045 0.94051126 0.940526 0.9285613 0.9285613 0.9523742 0.94643395 0.94656062] mean value: 0.9446460333429583 key: test_accuracy value: [0.94736842 0.76315789 0.78947368 0.78378378 0.94594595 0.83783784 0.97297297 0.94594595 0.91891892 0.81081081] mean value: 0.8716216216216216 key: train_accuracy value: [0.9761194 0.9761194 0.97910448 0.9702381 0.9702381 0.96428571 0.96428571 0.97619048 0.97321429 0.97321429] mean value: 0.9723009950248757 key: test_fscore value: [0.94736842 0.7804878 0.76470588 0.78947368 0.94736842 0.84210526 0.97297297 0.94117647 0.90909091 0.8 ] mean value: 
0.8694749829356792 key: train_fscore value: [0.97546012 0.97575758 0.97885196 0.9695122 0.96969697 0.96385542 0.96385542 0.97590361 0.97280967 0.97264438] mean value: 0.9718347329426844 key: test_precision value: [0.94736842 0.72727273 0.86666667 0.78947368 0.9 0.8 0.94736842 1. 1. 0.82352941] mean value: 0.8801679332019889 key: train_precision value: [0.98757764 0.97575758 0.97590361 0.97546012 0.97560976 0.96385542 0.96385542 0.97590361 0.97575758 0.98159509] mean value: 0.9751275834377349 key: test_recall value: [0.94736842 0.84210526 0.68421053 0.78947368 1. 0.88888889 1. 0.88888889 0.83333333 0.77777778] mean value: 0.8652046783625731 key: train_recall value: [0.96363636 0.97575758 0.98181818 0.96363636 0.96385542 0.96385542 0.96385542 0.97590361 0.96987952 0.96385542] mean value: 0.9686053304125594 key: test_roc_auc value: [0.94736842 0.76315789 0.78947368 0.78362573 0.94736842 0.83918129 0.97368421 0.94444444 0.91666667 0.80994152] mean value: 0.8714912280701754 key: train_roc_auc value: [0.97593583 0.97611408 0.97914439 0.97012228 0.970163 0.96428065 0.96428065 0.9761871 0.97317505 0.97310418] mean value: 0.9722507216218284 key: test_jcc value: [0.9 0.64 0.61904762 0.65217391 0.9 0.72727273 0.94736842 0.88888889 0.83333333 0.66666667] mean value: 0.7774751569305345 key: train_jcc value: [0.95209581 0.95266272 0.95857988 0.9408284 0.94117647 0.93023256 0.93023256 0.95294118 0.94705882 0.94674556] mean value: 0.9452553963297876 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', 
ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02484918 0.00989151 0.01043844 0.01072907 0.01075554 0.01006603 0.00961328 0.0097363 0.00997925 0.00970602] mean value: 0.011576461791992187 key: score_time value: [0.01031947 0.00934625 0.00986648 0.00878239 0.00917149 0.00902915 0.00898862 0.00896931 0.00876212 0.0088017 ] mean value: 0.009203696250915527 key: test_mcc value: [0.78947368 0.31622777 0.61017022 0.63129316 0.67849265 0.63129316 0.73020842 0.73821295 0.62170355 0.78362573] mean value: 0.6530701274630709 key: train_mcc value: [0.67195163 0.71367434 0.64874079 0.72056751 0.63176039 0.66756867 0.75591389 0.64914987 0.66690353 0.71425535] mean value: 0.6840485961335292 key: test_accuracy value: [0.89473684 0.65789474 0.78947368 0.81081081 0.83783784 0.81081081 0.86486486 0.86486486 0.81081081 0.89189189] mean value: 0.8233997155049787 key: train_accuracy value: [0.8358209 0.85671642 0.8238806 0.86011905 0.81547619 0.83333333 0.87797619 0.82440476 0.83333333 0.85714286] mean value: 0.8418203624733476 key: test_fscore value: [0.89473684 0.64864865 0.75 0.8 0.82352941 0.82051282 0.85714286 0.84848485 0.8 0.88888889] mean value: 0.8131944317548032 key: train_fscore value: [0.82972136 0.85185185 0.81504702 0.85448916 0.80745342 0.82608696 0.87613293 0.81846154 0.82822086 0.85454545] mean value: 0.8362010555198316 key: test_precision value: [0.89473684 0.66666667 0.92307692 0.875 0.875 0.76190476 0.88235294 0.93333333 0.82352941 0.88888889] mean value: 0.8524489768917013 key: train_precision value: [0.84810127 0.86792453 0.84415584 0.87341772 0.83333333 0.8525641 0.87878788 0.83647799 0.84375 0.8597561 ] mean 
value: 0.8538268759467177 key: test_recall value: [0.89473684 0.63157895 0.63157895 0.73684211 0.77777778 0.88888889 0.83333333 0.77777778 0.77777778 0.88888889] mean value: 0.7839181286549708 key: train_recall value: [0.81212121 0.83636364 0.78787879 0.83636364 0.78313253 0.80120482 0.87349398 0.80120482 0.81325301 0.84939759] mean value: 0.8194414019715224 key: test_roc_auc value: [0.89473684 0.65789474 0.78947368 0.8128655 0.83625731 0.8128655 0.86403509 0.8625731 0.80994152 0.89181287] mean value: 0.8232456140350878 key: train_roc_auc value: [0.83547237 0.85641711 0.82335116 0.85970229 0.81509568 0.83295535 0.87792346 0.82413182 0.83309709 0.85705174] mean value: 0.8415198065929164 key: test_jcc value: [0.80952381 0.48 0.6 0.66666667 0.7 0.69565217 0.75 0.73684211 0.66666667 0.8 ] mean value: 0.6905351422033345 key: train_jcc value: [0.70899471 0.74193548 0.68783069 0.74594595 0.67708333 0.7037037 0.77956989 0.69270833 0.70680628 0.74603175] mean value: 0.7190610118240058 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01237154 0.01653957 0.0177362 0.02279115 0.01850939 0.01929617 0.01568437 0.01943159 0.01927876 0.01795244] mean value: 0.017959117889404297 key: score_time value: [0.00886655 0.01132083 0.01133919 0.01192641 0.01196933 0.01196837 0.01189017 0.01204634 0.01206517 0.01196551] mean value: 0.01153578758239746 key: test_mcc value: [0.89473684 0.63960215 0.4330127 0.74044197 0.78362573 0.63129316 0.7888597 0.83918129 0.84959079 0.84834956] mean value: 0.7448693884912015 key: train_mcc value: [0.88773584 0.90033348 0.49718308 0.92909689 0.85527622 0.91673163 0.86308142 0.91673163 0.85029687 0.87836587] mean value: 0.8494832930737864 key: test_accuracy value: [0.94736842 0.81578947 0.65789474 0.86486486 0.89189189 0.81081081 0.89189189 0.91891892 0.91891892 0.91891892] mean value: 0.8637268847795164 key: train_accuracy value: [0.94328358 0.94925373 0.69552239 0.96428571 0.92261905 0.95833333 0.93154762 0.95833333 0.92261905 0.9375 ] mean value: 0.9183297796730633 key: test_fscore value: [0.94736842 0.8 0.74509804 0.85714286 0.88888889 0.82051282 0.89473684 0.91891892 0.92307692 0.90909091] mean value: 0.8704834620004899 key: train_fscore value: [0.94080997 0.94670846 0.76388889 0.96296296 0.91503268 0.95808383 0.9305136 0.95808383 0.92571429 0.93375394] mean value: 0.9235552453156383 key: test_precision value: [0.94736842 0.875 0.59375 0.9375 0.88888889 0.76190476 0.85 0.89473684 0.85714286 1. ] mean value: 0.8606291771094402 key: train_precision value: [0.96794872 0.98051948 0.61797753 0.98113208 1. 
0.95238095 0.93333333 0.95238095 0.88043478 0.98013245] mean value: 0.9246240273064844 key: test_recall value: [0.94736842 0.73684211 1. 0.78947368 0.88888889 0.88888889 0.94444444 0.94444444 1. 0.83333333] mean value: 0.8973684210526316 key: train_recall value: [0.91515152 0.91515152 1. 0.94545455 0.84337349 0.96385542 0.92771084 0.96385542 0.97590361 0.89156627] mean value: 0.9342022635998539 key: test_roc_auc value: [0.94736842 0.81578947 0.65789474 0.86695906 0.89181287 0.8128655 0.89327485 0.91959064 0.92105263 0.91666667] mean value: 0.8643274853801169 key: train_roc_auc value: [0.94286988 0.94875223 0.7 0.96395534 0.92168675 0.9583983 0.93150248 0.9583983 0.92324592 0.9369596 ] mean value: 0.9185768799939413 key: test_jcc value: [0.9 0.66666667 0.59375 0.75 0.8 0.69565217 0.80952381 0.85 0.85714286 0.83333333] mean value: 0.7756068840579711 key: train_jcc value: [0.88823529 0.89880952 0.61797753 0.92857143 0.84337349 0.91954023 0.8700565 0.91954023 0.86170213 0.87573964] mean value: 0.8623545998139636 MCC on Blind test: 0.83 Accuracy on Blind test: 0.91 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01700234 0.01797867 0.01679993 0.01786995 0.0184319 0.01814771 0.01903105 0.01888227 0.0194633 0.01629138] mean value: 0.017989850044250487 key: score_time value: [0.01207399 0.01205277 0.012187 0.01223421 0.01236439 0.01218557 0.01222467 0.01221824 0.01213694 0.01256537] mean value: 0.012224316596984863 key: test_mcc value: [0.78947368 0.16439899 0.73786479 0.7163504 0.84959079 0.68035483 0.83871328 0.67849265 0.94721815 0.40611643] mean value: 0.6808573989700671 key: train_mcc value: [0.89255789 0.36277429 0.87738561 0.86960067 0.89521641 0.92337258 0.87750371 0.90781863 0.79887733 0.48613777] mean value: 0.7891244887522015 key: test_accuracy value: [0.89473684 0.52631579 0.86842105 0.83783784 0.91891892 0.83783784 0.91891892 0.83783784 0.97297297 0.64864865] mean value: 0.82624466571835 key: train_accuracy value: [0.94626866 0.6119403 0.93731343 0.93154762 0.94642857 0.96130952 0.9375 0.95238095 0.88988095 0.69345238] mean value: 0.8808022388059702 key: test_fscore value: [0.89473684 0.67857143 0.86486486 0.8125 0.92307692 0.84210526 0.91428571 0.82352941 0.97142857 0.43478261] mean value: 0.8159881627951018 key: train_fscore value: [0.94512195 0.7173913 0.93877551 0.92556634 0.94767442 0.96 0.93416928 0.94968553 0.87457627 0.55021834] mean value: 0.8743178952803997 key: test_precision value: [0.89473684 0.51351351 0.88888889 1. 0.85714286 0.8 0.94117647 0.875 1. 1. ] mean value: 0.8770458572238757 key: train_precision value: [0.95092025 0.55932203 0.90449438 0.99305556 0.91573034 0.98113208 0.97385621 0.99342105 1. 1. 
] mean value: 0.9271931891207361 key: test_recall value: [0.89473684 1. 0.84210526 0.68421053 1. 0.88888889 0.88888889 0.77777778 0.94444444 0.27777778] mean value: 0.8198830409356725 key: train_recall value: [0.93939394 1. 0.97575758 0.86666667 0.98192771 0.93975904 0.89759036 0.90963855 0.77710843 0.37951807] mean value: 0.8667360350492881 key: test_roc_auc value: [0.89473684 0.52631579 0.86842105 0.84210526 0.92105263 0.83918129 0.91812865 0.83625731 0.97222222 0.63888889] mean value: 0.8257309941520468 key: train_roc_auc value: [0.94616756 0.61764706 0.93787879 0.93040936 0.94684621 0.96105599 0.93703047 0.9518781 0.88855422 0.68975904] mean value: 0.8807226786873548 key: test_jcc value: [0.80952381 0.51351351 0.76190476 0.68421053 0.85714286 0.72727273 0.84210526 0.7 0.94444444 0.27777778] mean value: 0.7117895681053575 key: train_jcc value: [0.89595376 0.55932203 0.88461538 0.86144578 0.90055249 0.92307692 0.87647059 0.90419162 0.77710843 0.37951807] mean value: 0.7962255079162279 MCC on Blind test: 0.7 Accuracy on Blind test: 0.84 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.16781831 0.1466465 0.15688443 0.15079045 0.14964795 0.15363979 0.15089941 0.15201402 0.14735866 0.15484333] mean value: 0.15305428504943847 key: score_time value: [0.01549721 0.01608682 0.01628304 0.01581788 0.01523232 0.01664662 0.01632547 0.01588321 0.01541471 0.01603723] mean value: 0.01592245101928711 key: test_mcc value: [0.9486833 0.89473684 0.85280287 0.83918129 0.89736456 0.89181287 0.94736842 0.94721815 1. 0.94736842] mean value: 0.9166536711379966 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.92105263 0.91891892 0.94594595 0.94594595 0.97297297 0.97297297 1. 0.97297297] mean value: 0.9571834992887625 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.94736842 0.91428571 0.91891892 0.94736842 0.94444444 0.97297297 0.97142857 1. 0.97297297] mean value: 0.9564119411487833 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94736842 1. 0.94444444 0.9 0.94444444 0.94736842 1. 1. 0.94736842] mean value: 0.9580994152046783 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94736842 0.84210526 0.89473684 1. 0.94444444 1. 0.94444444 1. 1. ] mean value: 0.9573099415204678 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.92105263 0.91959064 0.94736842 0.94590643 0.97368421 0.97222222 1. 
0.97368421] mean value: 0.9574561403508772 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.9 0.84210526 0.85 0.9 0.89473684 0.94736842 0.94444444 1. 0.94736842] mean value: 0.9176023391812865 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.96 Accuracy on Blind test: 0.98 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', 
SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.05195355 0.04780698 0.07163262 0.05937195 0.06207514 0.06070542 0.06475472 0.06434655 0.0617311 0.04681015] mean value: 0.05911881923675537 key: score_time value: [0.02426767 0.02480149 0.030478 0.02772665 0.03134751 0.02338552 0.03266764 0.02413774 0.01961756 0.02972317] mean value: 0.026815295219421387 key: test_mcc value: [0.9486833 0.89973541 0.76376262 0.94736842 0.89736456 0.89181287 0.94736842 0.89181287 1. 0.94736842] mean value: 0.9135276881295149 key: train_mcc value: [0.99404571 1. 0.99404571 1. 1. 0.99406397 0.99406397 0.98816193 0.98229327 0.98229327] mean value: 0.9928967834016768 key: test_accuracy value: [0.97368421 0.94736842 0.86842105 0.97297297 0.94594595 0.94594595 0.97297297 0.94594595 1. 
0.97297297] mean value: 0.9546230440967283 key: train_accuracy value: [0.99701493 1. 0.99701493 1. 1. 0.99702381 0.99702381 0.99404762 0.99107143 0.99107143] mean value: 0.9964267945984364 key: test_fscore value: [0.97435897 0.94444444 0.84848485 0.97297297 0.94736842 0.94444444 0.97297297 0.94444444 1. 0.97297297] mean value: 0.9522464496148707 key: train_fscore value: [0.99696049 1. 0.99696049 1. 1. 0.99697885 0.99697885 0.99393939 0.99088146 0.99088146] mean value: 0.9963580988444394 key: test_precision value: [0.95 1. 1. 1. 0.9 0.94444444 0.94736842 0.94444444 1. 0.94736842] mean value: 0.9633625730994152 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.89473684 0.73684211 0.94736842 1. 0.94444444 1. 0.94444444 1. 1. ] mean value: 0.9467836257309942 key: train_recall value: [0.99393939 1. 0.99393939 1. 1. 0.9939759 0.9939759 0.98795181 0.98192771 0.98192771] mean value: 0.9927637824023366 key: test_roc_auc value: [0.97368421 0.94736842 0.86842105 0.97368421 0.94736842 0.94590643 0.97368421 0.94590643 1. 0.97368421] mean value: 0.9549707602339181 key: train_roc_auc value: [0.9969697 1. 0.9969697 1. 1. 0.99698795 0.99698795 0.9939759 0.99096386 0.99096386] mean value: 0.9963818912011683 key: test_jcc value: [0.95 0.89473684 0.73684211 0.94736842 0.9 0.89473684 0.94736842 0.89473684 1. 0.94736842] mean value: 0.9113157894736842 key: train_jcc value: [0.99393939 1. 0.99393939 1. 1. 
0.9939759 0.9939759 0.98795181 0.98192771 0.98192771] mean value: 0.9927637824023366 MCC on Blind test: 0.9 Accuracy on Blind test: 0.95 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), 
('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.07039738 0.0819912 0.04745841 0.07536864 0.0871973 0.08062267 0.09268117 0.08872008 0.07291102 0.04991651] mean value: 0.07472643852233887 key: score_time value: [0.02195954 0.01393104 0.01386929 0.02197075 0.02660823 0.0219326 0.02215648 0.02231789 0.01488566 0.02564073] mean value: 0.020527219772338866 key: test_mcc value: [0.79388419 0.31622777 0.69989647 0.42489158 0.7888597 0.29766651 0.56934383 0.56725146 0.62170355 0.4670794 ] mean value: 0.554680445041403 key: train_mcc value: [0.99404571 0.99404571 0.99404571 0.99406271 1. 1. 0.99406397 0.99406397 0.99406397 0.99406397] mean value: 0.9952455741790717 key: test_accuracy value: [0.89473684 0.65789474 0.84210526 0.7027027 0.89189189 0.64864865 0.78378378 0.78378378 0.81081081 0.72972973] mean value: 0.7746088193456615 key: train_accuracy value: [0.99701493 0.99701493 0.99701493 0.99702381 1. 1. 
0.99702381 0.99702381 0.99702381 0.99702381] mean value: 0.9976163823738451 key: test_fscore value: [0.88888889 0.66666667 0.82352941 0.66666667 0.89473684 0.60606061 0.76470588 0.77777778 0.8 0.6875 ] mean value: 0.7576532742283516 key: train_fscore value: [0.99696049 0.99696049 0.99696049 0.99696049 1. 1. 0.99697885 0.99697885 0.99697885 0.99697885] mean value: 0.9975757353143739 key: test_precision value: [0.94117647 0.65 0.93333333 0.78571429 0.85 0.66666667 0.8125 0.77777778 0.82352941 0.78571429] mean value: 0.802641223155929 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.84210526 0.68421053 0.73684211 0.57894737 0.94444444 0.55555556 0.72222222 0.77777778 0.77777778 0.61111111] mean value: 0.7230994152046784 key: train_recall value: [0.99393939 0.99393939 0.99393939 0.99393939 1. 1. 0.9939759 0.9939759 0.9939759 0.9939759 ] mean value: 0.9951661190215407 key: test_roc_auc value: [0.89473684 0.65789474 0.84210526 0.70614035 0.89327485 0.64619883 0.78216374 0.78362573 0.80994152 0.72660819] mean value: 0.7742690058479532 key: train_roc_auc value: [0.9969697 0.9969697 0.9969697 0.9969697 1. 1. 0.99698795 0.99698795 0.99698795 0.99698795] mean value: 0.9975830595107703 key: test_jcc value: [0.8 0.5 0.7 0.5 0.80952381 0.43478261 0.61904762 0.63636364 0.66666667 0.52380952] mean value: 0.6190193864106908 key: train_jcc value: [0.99393939 0.99393939 0.99393939 0.99393939 1. 1. 
0.9939759 0.9939759 0.9939759 0.9939759 ] mean value: 0.9951661190215407 MCC on Blind test: 0.54 Accuracy on Blind test: 0.77 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient 
Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.56135869 0.54637766 0.55692697 0.5550313 0.54440451 0.53952956 0.54246831 0.550493 0.56960869 0.54170847] mean value: 0.5507907152175904 key: score_time value: [0.00987864 0.00982213 0.00943971 0.00938654 0.00945401 0.00924611 0.0097692 0.01039886 0.00971866 0.00918841] mean value: 0.009630227088928222 key: test_mcc value: [0.9486833 0.84327404 0.89973541 0.89181287 0.89736456 0.89181287 0.94736842 0.89181287 1. 0.89736456] mean value: 0.9109228893996735 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.92105263 0.94736842 0.94594595 0.94594595 0.94594595 0.97297297 0.94594595 1. 0.94594595] mean value: 0.9544807965860598 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.91891892 0.94444444 0.94736842 0.94736842 0.94444444 0.97297297 0.94444444 1. 0.94736842] mean value: 0.9541689462742095 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94444444 1. 
0.94736842 0.9 0.94444444 0.94736842 0.94444444 1. 0.9 ] mean value: 0.9478070175438597 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.89473684 0.89473684 0.94736842 1. 0.94444444 1. 0.94444444 1. 1. ] mean value: 0.9625730994152046 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.92105263 0.94736842 0.94590643 0.94736842 0.94590643 0.97368421 0.94590643 1. 0.94736842] mean value: 0.9548245614035088 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.85 0.89473684 0.9 0.9 0.89473684 0.94736842 0.89473684 1. 0.9 ] mean value: 0.9131578947368421 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.91 Accuracy on Blind test: 0.96 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, 
num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02600288 0.02697515 0.02713609 0.02771711 0.0268352 0.02741385 0.02773118 0.04674792 0.04764318 0.03253341] mean value: 0.03167359828948975 key: score_time value: [0.01562953 0.01249695 0.01259661 0.01253057 0.01317787 0.04645109 0.0191071 0.01679397 0.01876235 0.01601005] mean value: 0.018355607986450195 key: test_mcc value: [0.48454371 0.31622777 0.45291081 0.26327408 0.51319869 0.24408665 0.13259028 0.35104619 0.52960948 0.18768409] mean value: 0.3475171758801787 key: train_mcc value: [0.88134724 0.68582485 0.76325259 0.70543403 0.91426696 0.9285613 0.70124655 0.74453167 0.9253171 0.7690121 ] mean value: 0.8018794389354479 key: test_accuracy value: [0.73684211 0.65789474 0.71052632 0.62162162 0.75675676 0.62162162 0.56756757 0.67567568 0.75675676 0.59459459] mean value: 0.6699857752489331 key: train_accuracy value: [0.93731343 0.82089552 0.86865672 0.83333333 0.95535714 0.96428571 0.83035714 0.85714286 0.96130952 0.87202381] mean value: 0.8900675195451315 key: test_fscore value: [0.70588235 0.64864865 0.64516129 0.5625 0.74285714 0.5625 0.5 0.64705882 0.70967742 0.57142857] mean value: 0.629571424908237 key: train_fscore value: [0.93203883 0.77777778 0.84615385 0.79562044 0.95268139 0.96385542 0.79272727 0.83098592 0.95924765 0.85121107] mean value: 0.8702299616326061 key: test_precision value: [0.8 0.66666667 0.83333333 0.69230769 0.76470588 0.64285714 0.57142857 0.6875 0.84615385 0.58823529] mean value: 0.709318842921784 key: train_precision value: [1. 1. 1. 1. 1. 0.96385542 1. 1. 1. 1. 
] mean value: 0.9963855421686747 key: test_recall value: [0.63157895 0.63157895 0.52631579 0.47368421 0.72222222 0.5 0.44444444 0.61111111 0.61111111 0.55555556] mean value: 0.5707602339181287 key: train_recall value: [0.87272727 0.63636364 0.73333333 0.66060606 0.90963855 0.96385542 0.65662651 0.71084337 0.92168675 0.74096386] mean value: 0.7806644760861629 key: test_roc_auc value: [0.73684211 0.65789474 0.71052632 0.62573099 0.75584795 0.61842105 0.56432749 0.67397661 0.75292398 0.59356725] mean value: 0.6690058479532164 key: train_roc_auc value: [0.93636364 0.81818182 0.86666667 0.83030303 0.95481928 0.96428065 0.82831325 0.85542169 0.96084337 0.87048193] mean value: 0.8885675321607285 key: test_jcc value: [0.54545455 0.48 0.47619048 0.39130435 0.59090909 0.39130435 0.33333333 0.47826087 0.55 0.4 ] mean value: 0.46367570111048373 key: train_jcc value: [0.87272727 0.63636364 0.73333333 0.66060606 0.90963855 0.93023256 0.65662651 0.71084337 0.92168675 0.74096386] mean value: 0.7773021897314416 MCC on Blind test: 0.45 Accuracy on Blind test: 0.72 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02517295 0.03688502 0.04392886 0.03298044 0.03695798 0.03685045 0.03669524 0.03654885 0.03679013 0.0368619 ] mean value: 0.03596718311309814 key: score_time value: [0.02065396 0.02096987 0.02294397 0.02419424 0.02160645 0.01831365 0.0233593 0.02103519 0.02230525 0.02347684] mean value: 0.02188587188720703 key: test_mcc value: [0.89473684 0.47633051 0.68803296 0.68035483 0.89736456 0.62280702 0.89736456 0.89181287 1. 0.89679028] mean value: 0.7945594433337401 key: train_mcc value: [0.89251337 0.89259616 0.88671444 0.89290904 0.88691246 0.88700621 0.88101481 0.8870542 0.88691246 0.88691246] mean value: 0.8880545619279835 key: test_accuracy value: [0.94736842 0.73684211 0.84210526 0.83783784 0.94594595 0.81081081 0.94594595 0.94594595 1. 0.94594595] mean value: 0.8958748221906117 key: train_accuracy value: [0.94626866 0.94626866 0.94328358 0.94642857 0.94345238 0.94345238 0.94047619 0.94345238 0.94345238 0.94345238] mean value: 0.9439987562189055 key: test_fscore value: [0.94736842 0.75 0.83333333 0.83333333 0.94736842 0.81081081 0.94736842 0.94444444 1. 0.94117647] mean value: 0.8955203655668051 key: train_fscore value: [0.94545455 0.94578313 0.94294294 0.94578313 0.94294294 0.94224924 0.94011976 0.94328358 0.94294294 0.94294294] mean value: 0.9434445164976734 key: test_precision value: [0.94736842 0.71428571 0.88235294 0.88235294 0.9 0.78947368 0.9 0.94444444 1. 1. 
] mean value: 0.8960278146346258 key: train_precision value: [0.94545455 0.94011976 0.93452381 0.94011976 0.94011976 0.95092025 0.93452381 0.93491124 0.94011976 0.94011976] mean value: 0.9400932454899698 key: test_recall value: [0.94736842 0.78947368 0.78947368 0.78947368 1. 0.83333333 1. 0.94444444 1. 0.88888889] mean value: 0.8982456140350877 key: train_recall value: [0.94545455 0.95151515 0.95151515 0.95151515 0.94578313 0.93373494 0.94578313 0.95180723 0.94578313 0.94578313] mean value: 0.9468674698795181 key: test_roc_auc value: [0.94736842 0.73684211 0.84210526 0.83918129 0.94736842 0.81140351 0.94736842 0.94590643 1. 0.94444444] mean value: 0.8961988304093568 key: train_roc_auc value: [0.94625668 0.94634581 0.94340463 0.94651781 0.9434798 0.94333806 0.94053863 0.94355067 0.9434798 0.9434798 ] mean value: 0.9440391700962778 key: test_jcc value: [0.9 0.6 0.71428571 0.71428571 0.9 0.68181818 0.9 0.89473684 1. 0.88888889] mean value: 0.8194015341383762 key: train_jcc value: [0.89655172 0.89714286 0.89204545 0.89714286 0.89204545 0.8908046 0.88700565 0.89265537 0.89204545 0.89204545] mean value: 0.8929484871255766 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), 
('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.25039506 0.26338816 0.26077509 0.28984904 0.34190536 0.20279026 0.25511289 0.26039338 0.27494764 0.29335022] mean value: 0.26929070949554446 key: score_time value: [0.02418995 0.02259231 0.02381563 0.0246079 0.02679062 0.02062678 0.02431607 0.02038789 0.02436209 0.03717709] mean value: 0.02488663196563721 key: test_mcc value: [0.89473684 0.47633051 0.65465367 0.68035483 0.89736456 0.62280702 0.78362573 0.89181287 1. 0.89679028] mean value: 0.7798476311382899 key: train_mcc value: [0.89251337 0.89259616 0.95822045 0.89290904 0.88691246 0.88700621 0.80949681 0.8870542 0.88691246 0.88691246] mean value: 0.8880533628410842 key: test_accuracy value: [0.94736842 0.73684211 0.81578947 0.83783784 0.94594595 0.81081081 0.89189189 0.94594595 1. 0.94594595] mean value: 0.8878378378378379 key: train_accuracy value: [0.94626866 0.94626866 0.97910448 0.94642857 0.94345238 0.94345238 0.9047619 0.94345238 0.94345238 0.94345238] mean value: 0.9440094171997157 key: test_fscore value: [0.94736842 0.75 0.78787879 0.83333333 0.94736842 0.81081081 0.88888889 0.94444444 1. 
0.94117647] mean value: 0.8851269578049764
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:115: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:118: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result(
key: train_fscore value: [0.94545455 0.94578313 0.97885196 0.94578313 0.94294294 0.94224924 0.90361446 0.94328358 0.94294294 0.94294294] mean value: 0.9433848883132298 key: test_precision value: [0.94736842 0.71428571 0.92857143 0.88235294 0.9 0.78947368 0.88888889 0.94444444 1. 1. ] mean value: 0.8995385522630105 key: train_precision value: [0.94545455 0.94011976 0.97590361 0.94011976 0.94011976 0.95092025 0.90361446 0.93491124 0.94011976 0.94011976] mean value: 0.9411402908141235 key: test_recall value: [0.94736842 0.78947368 0.68421053 0.78947368 1. 0.83333333 0.88888889 0.94444444 1. 0.88888889] mean value: 0.876608187134503 key: train_recall value: [0.94545455 0.95151515 0.98181818 0.95151515 0.94578313 0.93373494 0.90361446 0.95180723 0.94578313 0.94578313] mean value: 0.9456809054399415 key: test_roc_auc value: [0.94736842 0.73684211 0.81578947 0.83918129 0.94736842 0.81140351 0.89181287 0.94590643 1. 0.94444444] mean value: 0.8880116959064328 key: train_roc_auc value: [0.94625668 0.94634581 0.97914439 0.94651781 0.9434798 0.94333806 0.90474841 0.94355067 0.9434798 0.9434798 ] mean value: 0.9440341231706072 key: test_jcc value: [0.9 0.6 0.65 0.71428571 0.9 0.68181818 0.8 0.89473684 1. 
0.88888889] mean value: 0.8029729627098048 key: train_jcc value: [0.89655172 0.89714286 0.95857988 0.89714286 0.89204545 0.8908046 0.82417582 0.89265537 0.89204545 0.89204545] mean value: 0.8933189472825426 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging 
Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03251314 0.03675842 0.04648995 0.03551435 0.03649688 0.03578806 0.03548098 0.0361886 0.02459049 0.03698349] mean value: 0.0356804370880127 key: score_time value: [0.01281786 0.0131588 0.01396585 0.01327658 0.01287436 0.01276183 0.0128417 0.01410079 0.01277828 0.01403594] mean value: 0.013261198997497559 key: test_mcc value: [0.9486833 0.9486833 0.73786479 0.84327404 0.73786479 0.63245553 0.89473684 0.79388419 0.62807634 0.78764146] mean value: 0.7953164573929877 key: train_mcc value: [0.85307402 0.87648575 0.87660709 0.85906136 0.86472084 0.87660709 0.88241401 0.87648575 0.88269694 0.88275364] mean value: 0.8730906512039277 key: test_accuracy value: [0.97368421 0.97368421 0.86842105 0.92105263 0.86842105 0.81578947 0.94736842 0.89473684 0.81081081 0.89189189] mean value: 0.8965860597439544 key: train_accuracy value: [0.92647059 0.93823529 0.93823529 0.92941176 0.93235294 0.93823529 
0.94117647 0.93823529 0.94134897 0.94134897] mean value: 0.9365050888390547 key: test_fscore value: [0.97435897 0.97435897 0.86486486 0.92307692 0.86486486 0.81081081 0.94736842 0.9 0.82926829 0.88235294] mean value: 0.8971325067247442 key: train_fscore value: [0.9271137 0.93841642 0.93877551 0.93023256 0.93255132 0.93877551 0.94152047 0.9380531 0.94117647 0.94186047] mean value: 0.9368475523992993 key: test_precision value: [0.95 0.95 0.88888889 0.9 0.88888889 0.83333333 0.94736842 0.85714286 0.77272727 0.9375 ] mean value: 0.8925849662033872 key: train_precision value: [0.91907514 0.93567251 0.93063584 0.91954023 0.92982456 0.93063584 0.93604651 0.9408284 0.94117647 0.93641618] mean value: 0.9319851696271803 key: test_recall value: [1. 1. 0.84210526 0.94736842 0.84210526 0.78947368 0.94736842 0.94736842 0.89473684 0.83333333] mean value: 0.9043859649122807 key: train_recall value: [0.93529412 0.94117647 0.94705882 0.94117647 0.93529412 0.94705882 0.94705882 0.93529412 0.94117647 0.94736842] mean value: 0.9417956656346749 key: test_roc_auc value: [0.97368421 0.97368421 0.86842105 0.92105263 0.86842105 0.81578947 0.94736842 0.89473684 0.80847953 0.89035088] mean value: 0.8961988304093567 key: train_roc_auc value: [0.92647059 0.93823529 0.93823529 0.92941176 0.93235294 0.93823529 0.94117647 0.93823529 0.94134847 0.94133127] mean value: 0.9365032679738562 key: test_jcc value: [0.95 0.95 0.76190476 0.85714286 0.76190476 0.68181818 0.9 0.81818182 0.70833333 0.78947368] mean value: 0.817875939849624 key: train_jcc value: [0.86413043 0.8839779 0.88461538 0.86956522 0.87362637 0.88461538 0.88950276 0.88333333 0.88888889 0.89010989] mean value: 0.8812365570346593 MCC on Blind test: 0.84 Accuracy on Blind test: 0.92 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: 
TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))])
key: fit_time
value: [0.75315642 1.01591659 0.78028083 0.772331 0.9129324 0.76234174 0.9127357 0.87172246 0.7669878 0.96696448]
mean value: 0.8515369415283203
key: score_time
value: [0.01336694 0.01260972 0.01373863 0.01248026 0.01332521 0.01331091 0.01295376 0.01237464 0.01341295 0.01255274]
mean value: 0.013012576103210449
key: test_mcc
value: [0.9486833 0.89973541 0.73786479 0.79388419 0.78947368 0.63245553 0.84327404 0.79388419 0.68035483 0.78764146]
mean value: 0.790725142095092
key: train_mcc
value: [0.88825066 0.89417953 1. 0.8177744 0.90014017 0.91178048 0.97647059 0.8058963 1. 0.90030617]
mean value: 0.9094798300569955
key: test_accuracy
value: [0.97368421 0.94736842 0.86842105 0.89473684 0.89473684 0.81578947 0.92105263 0.89473684 0.83783784 0.89189189]
mean value: 0.8940256045519204
key: train_accuracy
value: [0.94411765 0.94705882 1. 0.90882353 0.95 0.95588235 0.98823529 0.90294118 1. 0.95014663]
mean value: 0.9547205451095394
key: test_fscore
value: [0.97435897 0.95 0.87179487 0.9 0.89473684 0.81081081 0.92307692 0.9 0.83333333 0.88235294]
mean value: 0.8940464696656647
key: train_fscore
value: [0.94428152 0.94736842 1. 0.90962099 0.95043732 0.95601173 0.98823529 0.90265487 1. 0.95043732]
mean value: 0.9549047464381037
key: test_precision
value: [0.95 0.9047619 0.85 0.85714286 0.89473684 0.83333333 0.9 0.85714286 0.88235294 0.9375 ]
mean value: 0.8866970735662686
key: train_precision
value: [0.94152047 0.94186047 1. 0.9017341 0.94219653 0.95321637 0.98823529 0.90532544 1. 0.94767442]
mean value: 0.9521763099568973
key: test_recall
value: [1. 1. 0.89473684 0.94736842 0.89473684 0.78947368 0.94736842 0.94736842 0.78947368 0.83333333]
mean value: 0.9043859649122807
key: train_recall
value: [0.94705882 0.95294118 1. 0.91764706 0.95882353 0.95882353 0.98823529 0.9 1. 0.95321637]
mean value: 0.9576745786033711
key: test_roc_auc
value: [0.97368421 0.94736842 0.86842105 0.89473684 0.89473684 0.81578947 0.92105263 0.89473684 0.83918129 0.89035088]
mean value: 0.8940058479532164
key: train_roc_auc
value: [0.94411765 0.94705882 1. 0.90882353 0.95 0.95588235 0.98823529 0.90294118 1. 0.9501376 ]
mean value: 0.9547196422428621
key: test_jcc
value: [0.95 0.9047619 0.77272727 0.81818182 0.80952381 0.68181818 0.85714286 0.81818182 0.71428571 0.78947368]
mean value: 0.8116097060833903
key: train_jcc
value: [0.89444444 0.9 1. 0.8342246 0.90555556 0.91573034 0.97674419 0.82258065 1. 0.90555556]
mean value: 0.9154835322772491
MCC on Blind test: 0.76
Accuracy on Blind test: 0.88

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())])
key: fit_time
value: [0.01436734 0.01217151 0.01070666 0.00985885 0.00983238 0.01114702 0.01055717 0.01079178 0.0106504 0.01036644]
mean value: 0.011044955253601075
key: score_time
value: [0.01219034 0.00957274 0.00930095 0.00910378 0.00889158 0.00985646 0.00977993 0.00970864 0.00965691 0.00958467]
mean value: 0.009764599800109863
key: test_mcc
value: [0.68803296 0.69989647 0.69989647 0.79388419 0.57894737 0.63960215 0.47633051 0.73786479 0.57184997 0.69007214]
mean value: 0.6576377014238246
key: train_mcc
value: [0.66106903 0.63812671 0.63133581 0.66254793 0.63888551 0.6871247 0.63426969 0.68813955 0.67443892 0.63456594]
mean value: 0.6550503783024093
key: test_accuracy
value: [0.84210526 0.84210526 0.84210526 0.89473684 0.78947368 0.81578947 0.73684211 0.86842105 0.78378378 0.83783784]
mean value: 0.8253200568990042
key: train_accuracy
value: [0.82941176 0.81764706 0.80588235 0.82941176 0.81764706 0.84117647 0.81470588 0.84117647 0.83577713 0.81524927]
mean value: 0.8248085216491289
key: test_fscore
value: [0.83333333 0.82352941 0.82352941 0.88888889 0.78947368 0.8 0.72222222 0.87179487 0.77777778 0.8125 ]
mean value: 0.8143049601757032
key: train_fscore
value: [0.82208589 0.80864198 0.77852349 0.81987578 0.80745342 0.83125 0.80250784 0.83018868 0.82716049 0.80495356]
mean value: 0.813264111779322
key: test_precision
value: [0.88235294 0.93333333 0.93333333 0.94117647 0.78947368 0.875 0.76470588 0.85 0.82352941 0.92857143]
mean value: 0.8721476485330975
key: train_precision
value: [0.85897436 0.85064935 0.90625 0.86842105 0.85526316 0.88666667 0.8590604 0.89189189 0.87012987 0.85526316]
mean value: 0.8702569909417754
key: test_recall
value: [0.78947368 0.73684211 0.73684211 0.84210526 0.78947368 0.73684211 0.68421053 0.89473684 0.73684211 0.72222222]
mean value: 0.7669590643274854
key: train_recall
value: [0.78823529 0.77058824 0.68235294 0.77647059 0.76470588 0.78235294 0.75294118 0.77647059 0.78823529 0.76023392]
mean value: 0.7642586859305125
key: test_roc_auc
value: [0.84210526 0.84210526 0.84210526 0.89473684 0.78947368 0.81578947 0.73684211 0.86842105 0.78508772 0.83479532]
mean value: 0.8251461988304094
key: train_roc_auc
value: [0.82941176 0.81764706 0.80588235 0.82941176 0.81764706 0.84117647 0.81470588 0.84117647 0.83563811 0.81541108]
mean value: 0.8248108015135879
key: test_jcc
value: [0.71428571 0.7 0.7 0.8 0.65217391 0.66666667 0.56521739 0.77272727 0.63636364 0.68421053]
mean value: 0.6891645120706905
key: train_jcc
value: [0.69791667 0.67875648 0.63736264 0.69473684 0.67708333 0.71122995 0.67015707 0.70967742 0.70526316 0.67357513]
mean value: 0.6855758677521984
MCC on Blind test: 0.71
Accuracy on Blind test: 0.85

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01090002 0.01093769 0.0109024 0.01084542 0.010324 0.01106477 0.01059437 0.01046681 0.00997615 0.01118135] mean value: 0.01071929931640625 key: score_time value: [0.0093627 0.0100379 0.00968313 0.00984097 0.01014543 0.0097847 0.00955224 0.00931573 0.00902176 0.00988913] mean value: 0.00966336727142334 key: test_mcc value: [0.52704628 0.68421053 0.68421053 0.79388419 0.78947368 0.52704628 0.78947368 0.63960215 0.48078072 0.57857577] mean value: 0.6494303793257087 key: train_mcc value: [0.74738216 0.73530684 0.77092175 0.74199852 0.72941176 0.74163853 0.72354193 0.77092175 0.75366357 0.73705515] mean value: 0.745184196259884 key: test_accuracy value: [0.76315789 0.84210526 0.84210526 0.89473684 0.89473684 0.76315789 0.89473684 0.81578947 0.72972973 0.78378378] mean value: 0.8224039829302987 key: train_accuracy value: [0.87352941 0.86764706 0.88529412 0.87058824 0.86470588 0.87058824 0.86176471 0.88529412 0.87683284 0.86803519] mean value: 0.8724279799896498 key: test_fscore value: [0.76923077 0.84210526 0.84210526 0.9 0.89473684 0.76923077 0.89473684 0.82926829 0.77272727 0.75 ] mean value: 0.8264141314398054 key: train_fscore value: [0.87536232 0.86803519 0.88695652 0.87356322 0.86470588 0.87283237 0.86135693 0.88695652 0.87647059 0.87179487] mean value: 0.8738034415804177 key: test_precision value: [0.75 0.84210526 0.84210526 0.85714286 0.89473684 0.75 0.89473684 0.77272727 0.68 0.85714286] mean value: 0.8140697197539303 key: train_precision value: [0.86285714 0.86549708 0.87428571 0.85393258 0.86470588 0.85795455 0.86390533 0.87428571 0.87647059 0.85 ] mean value: 
0.8643894573208194 key: test_recall value: [0.78947368 0.84210526 0.84210526 0.94736842 0.89473684 0.78947368 0.89473684 0.89473684 0.89473684 0.66666667] mean value: 0.8456140350877193 key: train_recall value: [0.88823529 0.87058824 0.9 0.89411765 0.86470588 0.88823529 0.85882353 0.9 0.87647059 0.89473684] mean value: 0.8835913312693499 key: test_roc_auc value: [0.76315789 0.84210526 0.84210526 0.89473684 0.89473684 0.76315789 0.89473684 0.81578947 0.7251462 0.78070175] mean value: 0.8216374269005848 key: train_roc_auc value: [0.87352941 0.86764706 0.88529412 0.87058824 0.86470588 0.87058824 0.86176471 0.88529412 0.87683179 0.86795666] mean value: 0.8724200206398349 key: test_jcc value: [0.625 0.72727273 0.72727273 0.81818182 0.80952381 0.625 0.80952381 0.70833333 0.62962963 0.6 ] mean value: 0.7079737854737855 key: train_jcc value: [0.77835052 0.76683938 0.796875 0.7755102 0.76165803 0.77435897 0.75647668 0.796875 0.78010471 0.77272727] mean value: 0.7759775771937931 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.010535 0.01050043 0.01047206 0.01034617 0.01016498 0.01028228 0.01025391 0.01106501 0.01017475 0.00908756] mean value: 0.010288214683532715 key: score_time value: [0.01397395 0.01574492 0.01688623 0.01823497 0.0147984 0.01664305 0.01726961 0.01742172 0.01480031 0.01511502] mean value: 0.01608881950378418 key: test_mcc value: [0.57894737 0.57894737 0.31622777 0.57894737 0.43643578 0.42163702 0.42640143 0.21821789 0.40780312 0.75614764] mean value: 0.4719712759930457 key: train_mcc value: [0.65304287 0.6882472 0.68254191 0.65322377 0.69416569 0.69455037 0.65886913 0.67657595 0.73636217 0.607149 ] mean value: 0.6744728059621842 key: test_accuracy value: [0.78947368 0.78947368 0.65789474 0.78947368 0.71052632 0.71052632 0.71052632 0.60526316 0.7027027 0.86486486] mean value: 0.7330725462304409 key: train_accuracy value: [0.82647059 0.84411765 0.84117647 0.82647059 0.84705882 0.84705882 0.82941176 0.83823529 0.86803519 0.80351906] mean value: 0.8371554252199414 key: test_fscore value: [0.78947368 0.78947368 0.64864865 0.78947368 0.74418605 0.71794872 0.68571429 0.54545455 0.73170732 0.83870968] mean value: 0.7280790291401931 key: train_fscore value: [0.82798834 0.84457478 0.84302326 0.8238806 0.84615385 0.84971098 0.83040936 0.83965015 0.86567164 0.80235988] mean value: 0.8373422826187441 key: test_precision value: [0.78947368 0.78947368 0.66666667 0.78947368 0.66666667 0.7 0.75 0.64285714 0.68181818 1. 
] mean value: 0.7476429710640237 key: train_precision value: [0.82080925 0.84210526 0.83333333 0.83636364 0.85119048 0.83522727 0.8255814 0.83236994 0.87878788 0.80952381] mean value: 0.8365292256184584 key: test_recall value: [0.78947368 0.78947368 0.63157895 0.78947368 0.84210526 0.73684211 0.63157895 0.47368421 0.78947368 0.72222222] mean value: 0.7195906432748538 key: train_recall value: [0.83529412 0.84705882 0.85294118 0.81176471 0.84117647 0.86470588 0.83529412 0.84705882 0.85294118 0.79532164] mean value: 0.8383556931544548 key: test_roc_auc value: [0.78947368 0.78947368 0.65789474 0.78947368 0.71052632 0.71052632 0.71052632 0.60526316 0.7002924 0.86111111] mean value: 0.7324561403508772 key: train_roc_auc value: [0.82647059 0.84411765 0.84117647 0.82647059 0.84705882 0.84705882 0.82941176 0.83823529 0.86799106 0.80354317] mean value: 0.8371534227726178 key: test_jcc value: [0.65217391 0.65217391 0.48 0.65217391 0.59259259 0.56 0.52173913 0.375 0.57692308 0.72222222] mean value: 0.578499876130311 key: train_jcc value: [0.70646766 0.73096447 0.72864322 0.70050761 0.73333333 0.73869347 0.71 0.72361809 0.76315789 0.66995074] mean value: 0.7205336483765594 MCC on Blind test: 0.49 Accuracy on Blind test: 0.74 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, 
oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01583338 0.0182128 0.0161047 0.01597238 0.0159862 0.01671791 0.01628447 0.01583791 0.01580572 0.01597381] mean value: 0.016272926330566408 key: score_time value: [0.01147938 0.01161909 0.01118112 0.01065826 0.01088691 0.01061678 0.01073647 0.01067019 0.01060247 0.01049829] mean value: 0.01089489459991455 key: test_mcc value: [0.89473684 0.9486833 0.78947368 0.78947368 0.78947368 0.58218174 0.89473684 0.73786479 0.51793973 0.78764146] mean value: 0.773220574982312 key: train_mcc value: [0.80022155 0.79424133 0.80600787 0.80600787 0.80600787 0.81182089 0.78828985 0.8 0.82410816 0.80678035] mean value: 0.8043485716732236 key: test_accuracy value: [0.94736842 0.97368421 0.89473684 0.89473684 0.89473684 0.78947368 0.94736842 0.86842105 0.75675676 0.89189189] mean value: 0.8859174964438122 key: train_accuracy value: [0.9 0.89705882 0.90294118 0.90294118 0.90294118 0.90588235 0.89411765 0.9 0.91202346 0.90322581] mean value: 0.9021131619803346 key: test_fscore value: [0.94736842 0.97435897 0.89473684 0.89473684 0.89473684 0.77777778 0.94736842 0.87179487 0.7804878 0.88235294] mean value: 0.8865719738407196 key: train_fscore value: [0.90116279 0.89795918 0.90379009 0.90379009 0.90379009 0.90643275 0.89473684 0.9 0.9122807 0.90489914] mean value: 0.9028841664606161 key: test_precision value: [0.94736842 0.95 0.89473684 0.89473684 0.89473684 0.82352941 0.94736842 0.85 0.72727273 0.9375 ] mean value: 0.8867249507458486 key: train_precision value: [0.8908046 0.89017341 0.89595376 0.89595376 0.89595376 0.90116279 0.88953488 0.9 0.90697674 0.89204545] mean value: 
0.8958559152932181 key: test_recall value: [0.94736842 1. 0.89473684 0.89473684 0.89473684 0.73684211 0.94736842 0.89473684 0.84210526 0.83333333] mean value: 0.8885964912280702 key: train_recall value: [0.91176471 0.90588235 0.91176471 0.91176471 0.91176471 0.91176471 0.9 0.9 0.91764706 0.91812865] mean value: 0.9100481596147231 key: test_roc_auc value: [0.94736842 0.97368421 0.89473684 0.89473684 0.89473684 0.78947368 0.94736842 0.86842105 0.75438596 0.89035088] mean value: 0.8855263157894737 key: train_roc_auc value: [0.9 0.89705882 0.90294118 0.90294118 0.90294118 0.90588235 0.89411765 0.9 0.9120399 0.90318197] mean value: 0.902110423116615 key: test_jcc value: [0.9 0.95 0.80952381 0.80952381 0.80952381 0.63636364 0.9 0.77272727 0.64 0.78947368] mean value: 0.8017136021872864 key: train_jcc value: [0.82010582 0.81481481 0.82446809 0.82446809 0.82446809 0.82887701 0.80952381 0.81818182 0.83870968 0.82631579] mean value: 0.8229932990186044 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.32342815 1.41316032 1.3804853 1.2588563 1.50634933 1.53654742 1.28589988 1.45925713 1.40028787 1.36803317] mean value: 1.3932304859161377 key: score_time value: [0.01891804 0.01499844 0.01266909 0.01278591 0.01561332 0.01240611 0.01493168 0.01475978 0.02181697 0.01519823] mean value: 0.015409755706787109 key: test_mcc value: [0.89473684 0.89973541 0.78947368 0.89973541 0.73786479 0.63245553 0.84327404 0.73786479 0.56725146 0.78764146] mean value: 0.7790033421492283 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.94736842 0.89473684 0.94736842 0.86842105 0.81578947 0.92105263 0.86842105 0.78378378 0.89189189] mean value: 0.8886201991465149 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 0.95 0.89473684 0.94444444 0.87179487 0.81081081 0.91891892 0.87179487 0.78947368 0.88235294] mean value: 0.8881695806308809 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94736842 0.9047619 0.89473684 1. 0.85 0.83333333 0.94444444 0.85 0.78947368 0.9375 ] mean value: 0.8951618629908104 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 
1.] mean value: 1.0 key: test_recall value: [0.94736842 1. 0.89473684 0.89473684 0.89473684 0.78947368 0.89473684 0.89473684 0.78947368 0.83333333] mean value: 0.8833333333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.94736842 0.89473684 0.94736842 0.86842105 0.81578947 0.92105263 0.86842105 0.78362573 0.89035088] mean value: 0.8884502923976608 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.9047619 0.80952381 0.89473684 0.77272727 0.68181818 0.85 0.77272727 0.65217391 0.78947368] mean value: 0.8027942880917709 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.73 Accuracy on Blind test: 0.86 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, 
num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02541423 0.01783824 0.01625419 0.01690006 0.01592183 0.01577139 0.01574349 0.01575923 0.01588821 0.01514101] mean value: 0.017063188552856445 key: score_time value: [0.01225829 0.00914264 0.00886273 0.00880837 0.00879955 0.00871849 0.00876212 0.0086875 0.00877857 0.00877881] mean value: 0.009159708023071289 key: test_mcc value: [1. 
0.84327404 0.79388419 0.9486833 0.84327404 0.89973541 0.85280287 0.89973541 0.68035483 0.89181287] mean value: 0.8653556953757348 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.92105263 0.89473684 0.97368421 0.92105263 0.94736842 0.92105263 0.94736842 0.83783784 0.94594595] mean value: 0.9310099573257468 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.92307692 0.88888889 0.97435897 0.92307692 0.94444444 0.92682927 0.95 0.83333333 0.94444444] mean value: 0.9308453199916614 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9 0.94117647 0.95 0.9 1. 0.86363636 0.9047619 0.88235294 0.94444444] mean value: 0.9286372124607418 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94736842 0.84210526 1. 0.94736842 0.89473684 1. 1. 0.78947368 0.94444444] mean value: 0.9365497076023391 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.92105263 0.89473684 0.97368421 0.92105263 0.94736842 0.92105263 0.94736842 0.83918129 0.94590643] mean value: 0.9311403508771929 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.85714286 0.8 0.95 0.85714286 0.89473684 0.86363636 0.9047619 0.71428571 0.89473684] mean value: 0.8736443381180223 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.89 Accuracy on Blind test: 0.95 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.10692024 0.10610008 0.10551596 0.10557318 0.1064024 0.10624623 0.10674787 0.10648775 0.10734653 0.10737967] mean value: 0.10647199153900147 key: score_time value: [0.01746225 0.01743841 0.01741266 0.01759815 0.01770043 0.01763558 0.01764989 0.0177474 0.01797533 0.01771569] mean value: 0.017633581161499025 key: test_mcc value: [0.9486833 0.89473684 0.63960215 0.79388419 0.73786479 0.73786479 0.84327404 0.79388419 0.56934383 0.83871328] mean value: 0.7797851390935022 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.81578947 0.89473684 0.86842105 0.86842105 0.92105263 0.89473684 0.78378378 0.91891892] mean value: 0.8886913229018493 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.94736842 0.8 0.9 0.86486486 0.87179487 0.92307692 0.9 0.8 0.91428571] mean value: 0.889574976943398 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.95 0.94736842 0.875 0.85714286 0.88888889 0.85 0.9 0.85714286 0.76190476 0.94117647] mean value: 0.8828624256720232 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94736842 0.73684211 0.94736842 0.84210526 0.89473684 0.94736842 0.94736842 0.84210526 0.88888889] mean value: 0.8994152046783626 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.81578947 0.89473684 0.86842105 0.86842105 0.92105263 0.89473684 0.78216374 0.91812865] mean value: 0.8884502923976607 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.9 0.66666667 0.81818182 0.76190476 0.77272727 0.85714286 0.81818182 0.66666667 0.84210526] mean value: 0.8053577124629756 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.82 Accuracy on Blind test: 0.91 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, 
gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00968361 0.00963068 0.00972104 0.00968266 0.00966477 0.00967169 0.00958872 0.00959301 0.0098207 0.00972056] mean value: 0.009677743911743164 key: score_time value: [0.00886869 0.0086751 0.00869846 0.00869846 0.00874829 0.00864029 0.00864649 0.00878 0.00873041 0.00866818] mean value: 0.008715438842773437 key: test_mcc value: [0.37047929 0.78947368 0.42640143 0.68803296 0.47633051 0.58218174 0.42640143 0.47633051 0.29618896 0.62280702] mean value: 0.5154627531777716 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.68421053 0.89473684 0.71052632 0.84210526 0.73684211 0.78947368 0.71052632 0.73684211 0.64864865 0.81081081] mean value: 0.7564722617354196 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.89473684 0.68571429 0.83333333 0.72222222 0.8 0.68571429 0.75 0.66666667 0.81081081] mean value: 0.7515865113233534 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.70588235 0.89473684 0.75 0.88235294 0.76470588 0.76190476 0.75 0.71428571 0.65 0.78947368] mean value: 0.7663342178976854 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.63157895 0.89473684 0.63157895 0.78947368 0.68421053 0.84210526 0.63157895 0.78947368 0.68421053 0.83333333] mean value: 0.7412280701754386 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.68421053 0.89473684 0.71052632 0.84210526 0.73684211 0.78947368 0.71052632 0.73684211 0.64766082 0.81140351] mean value: 0.7564327485380117 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.80952381 0.52173913 0.71428571 0.56521739 0.66666667 0.52173913 0.6 0.5 0.68181818] mean value: 0.6080990024468285 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.43 Accuracy on Blind test: 0.71 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. 
To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.52656865 1.51507688 1.49541473 1.58690763 1.57518911 1.50592113 1.52353764 1.52296948 1.51906967 1.57698035] mean value: 1.5347635269165039 key: score_time value: [0.09216976 0.09152102 0.09123611 0.09885144 0.09371042 0.09946156 0.09564042 0.09658599 0.09777331 0.09974384] mean value: 0.09566938877105713 key: test_mcc value: [1. 0.89973541 0.78947368 0.89473684 0.84327404 0.84327404 1. 0.9486833 0.78362573 0.89181287] mean value: 0.8894615917123104 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.94736842 0.89473684 0.94736842 0.92105263 0.92105263 1. 0.97368421 0.89189189 0.94594595] mean value: 0.9443100995732574 key: train_accuracy value: [1. 
1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.94444444 0.89473684 0.94736842 0.92307692 0.91891892 1. 0.97435897 0.89473684 0.94444444] mean value: 0.9442085810506863 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.89473684 0.94736842 0.9 0.94444444 1. 0.95 0.89473684 0.94444444] mean value: 0.9475730994152046 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.89473684 0.89473684 0.94736842 0.94736842 0.89473684 1. 1. 0.89473684 0.94444444] mean value: 0.941812865497076 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.94736842 0.89473684 0.94736842 0.92105263 0.92105263 1. 0.97368421 0.89181287 0.94590643] mean value: 0.9442982456140351 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.89473684 0.80952381 0.9 0.85714286 0.85 1. 0.95 0.80952381 0.89473684] mean value: 0.8965664160401002 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.9 Accuracy on Blind test: 0.95 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.948488 0.93976998 0.88425088 0.91976881 0.97527862 0.93516684 0.89031529 0.95844722 0.9312284 0.9260869 ] mean value: 0.930880093574524 key: score_time value: [0.17965937 0.26049066 0.24061608 0.1728282 0.16420078 0.18590951 0.26020312 0.22205067 0.26080751 0.14662576] mean value: 0.20933916568756103 key: test_mcc value: [1. 0.89973541 0.73786479 0.89473684 0.84327404 0.89973541 0.89973541 0.89973541 0.73020842 0.89181287] mean value: 0.8696838599499482 key: train_mcc value: [0.95300713 0.94720632 0.95884012 0.95294118 0.95897286 0.95884012 0.95300713 0.96477265 0.97653939 0.95896113] mean value: 0.9583088027940004 key: test_accuracy value: [1. 
0.94736842 0.86842105 0.94736842 0.92105263 0.94736842 0.94736842 0.94736842 0.86486486 0.94594595] mean value: 0.9337126600284494 key: train_accuracy value: [0.97647059 0.97352941 0.97941176 0.97647059 0.97941176 0.97941176 0.97647059 0.98235294 0.98826979 0.97947214] mean value: 0.9791271347248577 key: test_fscore value: [1. 0.94444444 0.86486486 0.94736842 0.92307692 0.94444444 0.94444444 0.95 0.87179487 0.94444444] mean value: 0.933488285856707 key: train_fscore value: [0.97633136 0.97329377 0.97935103 0.97647059 0.97922849 0.97947214 0.97633136 0.98224852 0.98823529 0.97947214] mean value: 0.9790434694122674 key: test_precision value: [1. 1. 0.88888889 0.94736842 0.9 1. 1. 0.9047619 0.85 0.94444444] mean value: 0.9435463659147869 key: train_precision value: [0.98214286 0.98203593 0.98224852 0.97647059 0.98802395 0.97660819 0.98214286 0.98809524 0.98823529 0.98235294] mean value: 0.9828356363994447 key: test_recall value: [1. 0.89473684 0.84210526 0.94736842 0.94736842 0.89473684 0.89473684 1. 0.89473684 0.94444444] mean value: 0.9260233918128655 key: train_recall value: [0.97058824 0.96470588 0.97647059 0.97647059 0.97058824 0.98235294 0.97058824 0.97647059 0.98823529 0.97660819] mean value: 0.9753078775369797 key: test_roc_auc value: [1. 0.94736842 0.86842105 0.94736842 0.92105263 0.94736842 0.94736842 0.94736842 0.86403509 0.94590643] mean value: 0.933625730994152 key: train_roc_auc value: [0.97647059 0.97352941 0.97941176 0.97647059 0.97941176 0.97941176 0.97647059 0.98235294 0.98826969 0.97948056] mean value: 0.9791279669762643 key: test_jcc value: [1.
0.89473684 0.76190476 0.9 0.85714286 0.89473684 0.89473684 0.9047619 0.77272727 0.89473684] mean value: 0.877548416495785 key: train_jcc value: [0.95375723 0.94797688 0.95953757 0.95402299 0.95930233 0.95977011 0.95375723 0.96511628 0.97674419 0.95977011] mean value: 0.9589754910822583 MCC on Blind test: 0.85 Accuracy on Blind test: 0.92 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost 
Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02467036 0.00961518 0.00966334 0.00970984 0.00982165 0.00965047 0.00962496 0.00963306 0.00963211 0.00960302] mean value: 0.011162400245666504 key: score_time value: [0.01448631 0.00879812 0.00893068 0.0087347 0.0088098 0.0087955 0.00874925 0.00880694 0.00874519 0.00879836] mean value: 0.009365487098693847 key: test_mcc value: [0.52704628 0.68421053 0.68421053 0.79388419 0.78947368 0.52704628 0.78947368 0.63960215 0.48078072 0.57857577] mean value: 0.6494303793257087 key: train_mcc value: [0.74738216 0.73530684 0.77092175 0.74199852 0.72941176 0.74163853 0.72354193 0.77092175 0.75366357 0.73705515] mean value: 0.745184196259884 key: test_accuracy value: [0.76315789 0.84210526 0.84210526 0.89473684 0.89473684 0.76315789 0.89473684 0.81578947 0.72972973 0.78378378] mean value: 0.8224039829302987 key: train_accuracy value: [0.87352941 0.86764706 
0.88529412 0.87058824 0.86470588 0.87058824 0.86176471 0.88529412 0.87683284 0.86803519] mean value: 0.8724279799896498 key: test_fscore value: [0.76923077 0.84210526 0.84210526 0.9 0.89473684 0.76923077 0.89473684 0.82926829 0.77272727 0.75 ] mean value: 0.8264141314398054 key: train_fscore value: [0.87536232 0.86803519 0.88695652 0.87356322 0.86470588 0.87283237 0.86135693 0.88695652 0.87647059 0.87179487] mean value: 0.8738034415804177 key: test_precision value: [0.75 0.84210526 0.84210526 0.85714286 0.89473684 0.75 0.89473684 0.77272727 0.68 0.85714286] mean value: 0.8140697197539303 key: train_precision value: [0.86285714 0.86549708 0.87428571 0.85393258 0.86470588 0.85795455 0.86390533 0.87428571 0.87647059 0.85 ] mean value: 0.8643894573208194 key: test_recall value: [0.78947368 0.84210526 0.84210526 0.94736842 0.89473684 0.78947368 0.89473684 0.89473684 0.89473684 0.66666667] mean value: 0.8456140350877193 key: train_recall value: [0.88823529 0.87058824 0.9 0.89411765 0.86470588 0.88823529 0.85882353 0.9 0.87647059 0.89473684] mean value: 0.8835913312693499 key: test_roc_auc value: [0.76315789 0.84210526 0.84210526 0.89473684 0.89473684 0.76315789 0.89473684 0.81578947 0.7251462 0.78070175] mean value: 0.8216374269005848 key: train_roc_auc value: [0.87352941 0.86764706 0.88529412 0.87058824 0.86470588 0.87058824 0.86176471 0.88529412 0.87683179 0.86795666] mean value: 0.8724200206398349 key: test_jcc value: [0.625 0.72727273 0.72727273 0.81818182 0.80952381 0.625 0.80952381 0.70833333 0.62962963 0.6 ] mean value: 0.7079737854737855 key: train_jcc value: [0.77835052 0.76683938 0.796875 0.7755102 0.76165803 0.77435897 0.75647668 0.796875 0.78010471 0.77272727] mean value: 0.7759775771937931 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, 
interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost 
Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.2068522 0.1708672 0.06113815 0.06603909 0.06120896 0.05981898 0.05868864 0.29174948 0.05496788 0.05988955] mean value: 0.10912201404571534 key: score_time value: [0.01325941 0.01139426 0.01159883 0.01139307 0.01082778 0.01058769 0.01076388 0.01153779 0.01074386 0.01062775] mean value: 0.011273431777954101 key: test_mcc value: [1. 1. 1. 0.9486833 0.89473684 0.9486833 0.84327404 0.9486833 0.78362573 0.89181287] mean value: 0.92594993754596 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 1. 0.97368421 0.94736842 0.97368421 0.92105263 0.97368421 0.89189189 0.94594595] mean value: 0.9627311522048364 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 
1. 0.97435897 0.94736842 0.97297297 0.92307692 0.97435897 0.89473684 0.94444444] mean value: 0.9631317552370184 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 0.95 0.94736842 1. 0.9 0.95 0.89473684 0.94444444] mean value: 0.9586549707602339 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 0.94736842 0.94736842 0.94736842 1. 0.89473684 0.94444444] mean value: 0.9681286549707602 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 1. 0.97368421 0.94736842 0.97368421 0.92105263 0.97368421 0.89181287 0.94590643] mean value: 0.962719298245614 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 1. 1. 0.95 0.9 0.94736842 0.85714286 0.95 0.80952381 0.89473684] mean value: 0.9308771929824561 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04102659 0.08542848 0.07878828 0.07048774 0.07999039 0.06906962 0.06940484 0.05672765 0.0792861 0.05792046] mean value: 0.06881301403045655 key: score_time value: [0.0229919 0.02294731 0.02151322 0.01808548 0.02362418 0.02200913 0.01912975 0.01265907 0.01236296 0.01703215] mean value: 0.019235515594482423 key: test_mcc value: [0.80757285 0.9486833 0.79388419 0.84327404 0.73786479 0.48454371 0.73786479 0.73786479 0.51461988 0.78362573] mean value: 0.7389798067892077 key: train_mcc value: [0.92941176 0.94124161 0.95294118 0.92941176 0.95294118 0.94707521 0.94124161 0.9353103 0.94722901 0.95896113] mean value: 0.9435764748091761 key: test_accuracy value: [0.89473684 0.97368421 0.89473684 0.92105263 0.86842105 0.73684211 0.86842105 0.86842105 0.75675676 0.89189189] mean value: 0.8674964438122333 key: train_accuracy value: [0.96470588 0.97058824 0.97647059 0.96470588 0.97647059 0.97352941 0.97058824 0.96764706 0.97360704 0.97947214] mean value: 0.9717785061238572 key: test_fscore value: [0.9047619 0.97435897 0.88888889 0.92307692 0.86486486 0.70588235 0.87179487 0.87179487 0.75675676 0.88888889] mean value: 0.8651069298128121 key: train_fscore value: [0.96470588 0.9704142 0.97647059 0.96470588 0.97647059 0.97360704 0.9704142 0.96755162 0.97345133 0.97947214] mean value: 0.9717263472281472 key: test_precision value: [0.82608696 0.95 0.94117647 0.9 0.88888889 0.8 0.85 0.85 0.77777778 0.88888889] mean value: 0.867281898266553 key: train_precision value: [0.96470588 0.97619048 0.97647059 0.96470588 0.97647059 0.97076023 0.97619048 0.9704142 0.97633136 0.98235294] 
mean value: 0.97345926307822 key: test_recall value: [1. 1. 0.84210526 0.94736842 0.84210526 0.63157895 0.89473684 0.89473684 0.73684211 0.88888889] mean value: 0.8678362573099415 key: train_recall value: [0.96470588 0.96470588 0.97647059 0.96470588 0.97647059 0.97647059 0.96470588 0.96470588 0.97058824 0.97660819] mean value: 0.9700137598899209 key: test_roc_auc value: [0.89473684 0.97368421 0.89473684 0.92105263 0.86842105 0.73684211 0.86842105 0.86842105 0.75730994 0.89181287] mean value: 0.8675438596491228 key: train_roc_auc value: [0.96470588 0.97058824 0.97647059 0.96470588 0.97647059 0.97352941 0.97058824 0.96764706 0.97359821 0.97948056] mean value: 0.9717784657722739 key: test_jcc value: [0.82608696 0.95 0.8 0.85714286 0.76190476 0.54545455 0.77272727 0.77272727 0.60869565 0.8 ] mean value: 0.7694739318652362 key: train_jcc value: [0.93181818 0.94252874 0.95402299 0.93181818 0.95402299 0.94857143 0.94252874 0.93714286 0.94827586 0.95977011] mean value: 0.9450500074638005 MCC on Blind test: 0.75 Accuracy on Blind test: 0.88 Model_name: Multinomial Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01010537 0.01241589 0.01016307 0.00971317 0.01059246 0.01026273 0.0100174 0.00931168 0.00948501 0.00963616] mean value: 0.010170292854309083 key: score_time value: [0.00914979 0.00916171 0.00918078 0.00903225 0.00939584 0.00868344 0.009166 0.00891471 0.00897503 0.00866079] mean value: 0.009032034873962402 key: test_mcc value: [0.59222009 0.63960215 0.63245553 0.63960215 0.68421053 0.63245553 0.59222009 0.73786479 0.4670794 0.69007214] mean value: 0.6307782394614956 key: train_mcc value: [0.66563935 0.61817134 0.70059418 0.65349541 0.66610178 0.72946225 0.60715472 0.75894171 0.71355814 0.67276567] mean value: 0.6785884546131096 key: test_accuracy value: [0.78947368 0.81578947 0.81578947 0.81578947 0.84210526 0.81578947 0.78947368 0.86842105 0.72972973 0.83783784] mean value: 0.8120199146514936 key: train_accuracy value: [0.83235294 0.80882353 0.85 0.82647059 0.83235294 0.86470588 0.80294118 0.87941176 0.85630499 0.83577713] mean value: 0.8389140934966361 key: test_fscore value: [0.76470588 0.8 0.81081081 0.8 0.84210526 0.81081081 0.76470588 0.87179487 0.76190476 0.8125 ] mean value: 0.8039338283185032 key: train_fscore value: [0.82779456 0.8048048 0.84684685 0.82282282 0.82674772 0.86390533 0.79635258 0.87833828 0.85196375 0.8313253 ] mean value: 0.8350901992163299 key: test_precision value: [0.86666667 0.875 0.83333333 0.875 0.84210526 0.83333333 0.86666667 0.85 0.69565217 0.92857143] mean value: 0.8466328865642367 key: train_precision value: [0.85093168 0.82208589 0.86503067 0.8404908 0.85534591 0.86904762 0.82389937 0.88622754 0.8757764 0.85714286] mean value: 
0.8545978740616875 key: test_recall value: [0.68421053 0.73684211 0.78947368 0.73684211 0.84210526 0.78947368 0.68421053 0.89473684 0.84210526 0.72222222] mean value: 0.7722222222222223 key: train_recall value: [0.80588235 0.78823529 0.82941176 0.80588235 0.8 0.85882353 0.77058824 0.87058824 0.82941176 0.80701754] mean value: 0.8165841073271414 key: test_roc_auc value: [0.78947368 0.81578947 0.81578947 0.81578947 0.84210526 0.81578947 0.78947368 0.86842105 0.72660819 0.83479532] mean value: 0.8114035087719298 key: train_roc_auc value: [0.83235294 0.80882353 0.85 0.82647059 0.83235294 0.86470588 0.80294118 0.87941176 0.85622635 0.83586171] mean value: 0.8389146886824905 key: test_jcc value: [0.61904762 0.66666667 0.68181818 0.66666667 0.72727273 0.68181818 0.61904762 0.77272727 0.61538462 0.68421053] mean value: 0.673466007676534 key: train_jcc value: [0.70618557 0.67336683 0.734375 0.69897959 0.70466321 0.76041667 0.66161616 0.78306878 0.74210526 0.71134021] mean value: 0.7176117286148205 MCC on Blind test: 0.77 Accuracy on Blind test: 0.89 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01501966 0.02042222 0.01662898 0.02123141 0.01934409 0.0200212 0.01753616 0.01804042 0.01837301 0.01642108] mean value: 0.018303823471069337 key: score_time value: [0.0092628 0.0112505 0.01110458 0.01176238 0.01192117 0.01195502 0.01184368 0.01198912 0.01202178 0.01190639] mean value: 0.011501741409301759 key: test_mcc value: [0.9486833 0.89973541 0.38729833 0.84327404 0.67936622 0.56613852 0.29277002 0.79388419 0.56934383 0.62525715] mean value: 0.6605751008798381 key: train_mcc value: [0.87660709 0.8617507 0.57735027 0.91190671 0.81150267 0.73854895 0.42491829 0.83159022 0.93562485 0.84815135] mean value: 0.7817951100491364 key: test_accuracy value: [0.97368421 0.94736842 0.65789474 0.92105263 0.81578947 0.76315789 0.57894737 0.89473684 0.78378378 0.78378378] mean value: 0.8120199146514936 key: train_accuracy value: [0.93823529 0.92941176 0.75 0.95588235 0.89705882 0.85294118 0.65294118 0.90882353 0.96774194 0.92082111] mean value: 0.8773857167500432 key: test_fscore value: [0.97435897 0.95 0.73469388 0.92307692 0.84444444 0.70967742 0.7037037 0.9 0.8 0.71428571] mean value: 0.825424105677562 key: train_fscore value: [0.93877551 0.93220339 0.8 0.95626822 0.90666667 0.82758621 0.74235808 0.89967638 0.96735905 0.91588785] mean value: 0.8886781350091697 key: test_precision value: [0.95 0.9047619 0.6 0.9 0.73076923 0.91666667 0.54285714 0.85714286 0.76190476 1. ] mean value: 0.8164102564102564 key: train_precision value: [0.93063584 0.89673913 0.66666667 0.94797688 0.82926829 1. 0.59027778 1. 
0.9760479 0.98 ] mean value: 0.8817612488516776 key: test_recall value: [1. 1. 0.94736842 0.94736842 1. 0.57894737 1. 0.94736842 0.84210526 0.55555556] mean value: 0.8818713450292397 key: train_recall value: [0.94705882 0.97058824 1. 0.96470588 1. 0.70588235 1. 0.81764706 0.95882353 0.85964912] mean value: 0.9224355005159959 key: test_roc_auc value: [0.97368421 0.94736842 0.65789474 0.92105263 0.81578947 0.76315789 0.57894737 0.89473684 0.78216374 0.77777778] mean value: 0.8112573099415205 key: train_roc_auc value: [0.93823529 0.92941176 0.75 0.95588235 0.89705882 0.85294118 0.65294118 0.90882353 0.96771586 0.92100103] mean value: 0.8774011007911937 key: test_jcc value: [0.95 0.9047619 0.58064516 0.85714286 0.73076923 0.55 0.54285714 0.81818182 0.66666667 0.55555556] mean value: 0.7156580337225499 key: train_jcc value: [0.88461538 0.87301587 0.66666667 0.91620112 0.82926829 0.70588235 0.59027778 0.81764706 0.93678161 0.84482759] mean value: 0.8065183719244069 MCC on Blind test: 0.79 Accuracy on Blind test: 0.89 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01763225 0.0185833 0.01747012 0.01765466 0.01662111 0.01753116 0.01757503 0.01850629 0.01743722 0.01835084] mean value: 0.017736196517944336 key: score_time value: [0.01220918 0.01224709 0.0121901 0.01216316 0.01189256 0.01195359 0.01198053 0.01190829 0.01229215 0.01211333] mean value: 0.012094998359680175 key: test_mcc value: [0.80757285 0.89973541 0.61017022 0.84327404 0.79388419 0.63960215 0.79388419 0.78947368 0.62807634 0.78764146] mean value: 0.7593314528036907 key: train_mcc value: [0.8028464 0.8452381 0.67431767 0.90076395 0.87721456 0.90688708 0.91178048 0.82150888 0.87394751 0.91280274] mean value: 0.8527307362889015 key: test_accuracy value: [0.89473684 0.94736842 0.78947368 0.92105263 0.89473684 0.81578947 0.89473684 0.89473684 0.81081081 0.89189189] mean value: 0.8755334281650071 key: train_accuracy value: [0.89705882 0.91764706 0.81470588 0.95 0.93823529 0.95294118 0.95588235 0.90294118 0.93548387 0.95601173] mean value: 0.9220907365878903 key: test_fscore value: [0.9047619 0.94444444 0.81818182 0.92307692 0.88888889 0.8 0.88888889 0.89473684 0.82926829 0.88235294] mean value: 0.8774600944207529 key: train_fscore value: [0.90410959 0.91082803 0.84289277 0.94894895 0.93693694 0.95180723 0.95575221 0.89250814 0.93785311 0.95522388] mean value: 0.9236860841053656 key: test_precision value: [0.82608696 1. 0.72 0.9 0.94117647 0.875 0.94117647 0.89473684 0.77272727 0.9375 ] mean value: 0.8808404012530746 key: train_precision value: [0.84615385 0.99305556 0.73160173 0.96932515 0.95705521 0.97530864 0.95857988 1. 
0.90217391 0.97560976] mean value: 0.9308863694182445 key: test_recall value: [1. 0.89473684 0.94736842 0.94736842 0.84210526 0.73684211 0.84210526 0.89473684 0.89473684 0.83333333] mean value: 0.8833333333333333 key: train_recall value: [0.97058824 0.84117647 0.99411765 0.92941176 0.91764706 0.92941176 0.95294118 0.80588235 0.97647059 0.93567251] mean value: 0.9253319573443413 key: test_roc_auc value: [0.89473684 0.94736842 0.78947368 0.92105263 0.89473684 0.81578947 0.89473684 0.89473684 0.80847953 0.89035088] mean value: 0.8751461988304093 key: train_roc_auc value: [0.89705882 0.91764706 0.81470588 0.95 0.93823529 0.95294118 0.95588235 0.90294118 0.93560372 0.95607155] mean value: 0.922108703130375 key: test_jcc value: [0.82608696 0.89473684 0.69230769 0.85714286 0.8 0.66666667 0.8 0.80952381 0.70833333 0.78947368] mean value: 0.7844271841811887 key: train_jcc value: [0.825 0.83625731 0.72844828 0.90285714 0.88135593 0.90804598 0.91525424 0.80588235 0.88297872 0.91428571] mean value: 0.8600365665794898 MCC on Blind test: 0.81 Accuracy on Blind test: 0.9 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.16849089 0.14875078 0.15005994 0.15339446 0.15740991 0.15145826 0.14945006 0.15879703 0.16036916 0.15662646] mean value: 0.155480694770813 key: score_time value: [0.0154891 0.01526761 0.01554847 0.01697516 0.01604891 0.01547098 0.01667619 0.01671481 0.01670885 0.01569343] mean value: 0.016059350967407227 key: test_mcc value: [1. 1. 1. 0.9486833 0.9486833 0.9486833 0.89973541 0.89473684 0.73099415 0.89181287] mean value: 0.9263329164643102 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 1. 0.97368421 0.97368421 0.97368421 0.94736842 0.94736842 0.86486486 0.94594595] mean value: 0.9626600284495022 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 1. 0.97435897 0.97297297 0.97297297 0.95 0.94736842 0.86486486 0.94444444] mean value: 0.9626982650666861 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 0.95 1. 1. 0.9047619 0.94736842 0.88888889 0.94444444] mean value: 0.9635463659147869 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 0.94736842 0.94736842 1. 0.94736842 0.84210526 0.94444444] mean value: 0.9628654970760233 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 1. 0.97368421 0.97368421 0.97368421 0.94736842 0.94736842 0.86549708 0.94590643] mean value: 0.9627192982456141 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [1. 1. 1. 0.95 0.94736842 0.94736842 0.9047619 0.9 0.76190476 0.89473684] mean value: 0.9306140350877192 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.95 Accuracy on Blind test: 0.97 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.05852747 0.04814076 0.04541254 0.04895949 0.05033207 0.04772234 0.06385469 0.04760981 0.04737306 0.05188274] mean value: 0.0509814977645874 key: score_time value: [0.02652955 0.02458501 0.02486563 0.02379942 0.02537274 0.02120161 0.02789068 0.02503252 0.02294707 0.02493882] mean value: 0.02471630573272705 key: test_mcc value: [1. 1. 1. 0.9486833 0.89473684 0.89973541 0.9486833 0.9486833 0.73099415 0.94736842] mean value: 0.9318884720198657 key: train_mcc value: [1. 0.99413485 1. 1. 0.98830369 0.99413485 0.98823529 0.98830369 0.99415185 0.98833809] mean value: 0.9935602308976983 key: test_accuracy value: [1. 1. 1. 0.97368421 0.94736842 0.94736842 0.97368421 0.97368421 0.86486486 0.97297297] mean value: 0.9653627311522048 key: train_accuracy value: [1. 0.99705882 1. 1. 
0.99411765 0.99705882 0.99411765 0.99411765 0.99706745 0.9941349 ] mean value: 0.996767293427635 key: test_fscore value: [1. 1. 1. 0.97435897 0.94736842 0.94444444 0.97435897 0.97435897 0.86486486 0.97297297] mean value: 0.9652727626411837 key: train_fscore value: [1. 0.99705015 1. 1. 0.99408284 0.99705015 0.99411765 0.99408284 0.99705015 0.99411765] mean value: 0.9967551417068896 key: test_precision value: [1. 1. 1. 0.95 0.94736842 1. 0.95 0.95 0.88888889 0.94736842] mean value: 0.9633625730994152 key: train_precision value: [1. 1. 1. 1. 1. 1. 0.99411765 1. 1. 1. ] mean value: 0.9994117647058823 key: test_recall value: [1. 1. 1. 1. 0.94736842 0.89473684 1. 1. 0.84210526 1. ] mean value: 0.968421052631579 key: train_recall value: [1. 0.99411765 1. 1. 0.98823529 0.99411765 0.99411765 0.98823529 0.99411765 0.98830409] mean value: 0.994124527003784 key: test_roc_auc value: [1. 1. 1. 0.97368421 0.94736842 0.94736842 0.97368421 0.97368421 0.86549708 0.97368421] mean value: 0.9654970760233919 key: train_roc_auc value: [1. 0.99705882 1. 1. 0.99411765 0.99705882 0.99411765 0.99411765 0.99705882 0.99415205] mean value: 0.9967681458548332 key: test_jcc value: [1. 1. 1. 0.95 0.9 0.89473684 0.95 0.95 0.76190476 0.94736842] mean value: 0.9354010025062657 key: train_jcc value: [1. 0.99411765 1. 1. 
0.98823529 0.99411765 0.98830409 0.98823529 0.99411765 0.98830409] mean value: 0.9935431716546268 MCC on Blind test: 0.9 Accuracy on Blind test: 0.95 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.08551073 0.10424566 0.10746527 0.11186576 0.08251929 0.10721231 0.0956862 0.1061132 0.16514015 0.17688584] mean value: 0.11426444053649902 key: score_time value: [0.02205372 0.02442288 0.02611351 0.02609515 0.02227306 0.02218318 0.022053 0.02277589 0.03644013 0.04053092] mean value: 0.026494145393371582 key: test_mcc value: [0.58218174 0.68421053 0.42640143 0.68803296 0.59222009 0.68803296 0.68803296 0.68803296 0.35558302 0.69007214] mean value: 0.6082800795061165 key: train_mcc value: [0.99413485 0.99413485 0.99413485 1. 1. 0.99413485 0.99413485 0.99413485 0.99415185 0.99415205] mean value: 0.9953112973615207 key: test_accuracy value: [0.78947368 0.84210526 0.71052632 0.84210526 0.78947368 0.84210526 0.84210526 0.84210526 0.67567568 0.83783784] mean value: 0.8013513513513513 key: train_accuracy value: [0.99705882 0.99705882 0.99705882 1. 1. 
0.99705882 0.99705882 0.99705882 0.99706745 0.99706745] mean value: 0.9976487838537175 key: test_fscore value: [0.77777778 0.84210526 0.68571429 0.83333333 0.80952381 0.83333333 0.85 0.83333333 0.71428571 0.8125 ] mean value: 0.7991906850459483 key: train_fscore value: [0.99705015 0.99705015 0.99705015 1. 1. 0.99705015 0.99705015 0.99705015 0.99705015 0.99706745] mean value: 0.9976418481128729 key: test_precision value: [0.82352941 0.84210526 0.75 0.88235294 0.73913043 0.88235294 0.80952381 0.88235294 0.65217391 0.92857143] mean value: 0.8192093084373337 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.73684211 0.84210526 0.63157895 0.78947368 0.89473684 0.78947368 0.89473684 0.78947368 0.78947368 0.72222222] mean value: 0.7880116959064327 key: train_recall value: [0.99411765 0.99411765 0.99411765 1. 1. 0.99411765 0.99411765 0.99411765 0.99411765 0.99415205] mean value: 0.9952975576195391 key: test_roc_auc value: [0.78947368 0.84210526 0.71052632 0.84210526 0.78947368 0.84210526 0.84210526 0.84210526 0.67251462 0.83479532] mean value: 0.8007309941520468 key: train_roc_auc value: [0.99705882 0.99705882 0.99705882 1. 1. 0.99705882 0.99705882 0.99705882 0.99705882 0.99707602] mean value: 0.9976487788097695 key: test_jcc value: [0.63636364 0.72727273 0.52173913 0.71428571 0.68 0.71428571 0.73913043 0.71428571 0.55555556 0.68421053] mean value: 0.6687129153582243 key: train_jcc value: [0.99411765 0.99411765 0.99411765 1. 1. 
0.99411765 0.99411765 0.99411765 0.99411765 0.99415205] mean value: 0.9952975576195391 MCC on Blind test: 0.57 Accuracy on Blind test: 0.78 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), 
('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.57054996 0.5652597 0.55597973 0.57494354 0.56049299 0.54836679 0.56327939 0.55440116 0.55499816 0.55580854] mean value: 0.5604079961776733 key: score_time value: [0.00960946 0.01006937 0.0094645 0.00990939 0.00955105 0.00955939 0.00990939 0.00954437 0.0094986 0.00949216] mean value: 0.009660768508911132 key: test_mcc value: [1. 1. 0.9486833 0.9486833 0.89473684 0.89973541 0.89973541 0.9486833 0.78362573 0.89181287] mean value: 0.9215696154432907 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 0.97368421 0.97368421 0.94736842 0.94736842 0.94736842 0.97368421 0.89189189 0.94594595] mean value: 0.960099573257468 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 0.97435897 0.97435897 0.94736842 0.94444444 0.95 0.97435897 0.89473684 0.94444444] mean value: 0.9604071075123707 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.95 0.95 0.94736842 1. 
0.9047619 0.95 0.89473684 0.94444444] mean value: 0.9541311612364244 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 0.94736842 0.89473684 1. 1. 0.89473684 0.94444444] mean value: 0.9681286549707602 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 0.97368421 0.97368421 0.94736842 0.94736842 0.94736842 0.97368421 0.89181287 0.94590643] mean value: 0.9600877192982457 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 1. 0.95 0.95 0.9 0.89473684 0.9047619 0.95 0.80952381 0.89473684] mean value: 0.9253759398496241 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, 
n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") 
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02796578 0.02674651 0.02773929 0.02755022 0.02779508 0.02704573 0.0277493 0.03268075 0.02809811 0.02789998] mean value: 0.028127074241638184 key: score_time value: [0.01260114 0.0186615 0.01520371 0.01710987 0.01517677 0.01661873 0.01516271 0.02039289 0.01575422 0.01550627] mean value: 0.01621878147125244 key: test_mcc value: [0.52704628 0.26462806 0.21320072 0.68803296 0.63245553 0.26462806 0.16151457 0.37686733 0.18980224 0.6754386 ] mean value: 0.3993614351469908 key: train_mcc value: [0.94838881 0.95884012 0.70087664 0.9653073 0.88852332 0.86751214 0.86751214 0.79170339 0.87096663 0.95366475] mean value: 0.8813295248742146 key: test_accuracy value: [0.76315789 0.63157895 0.60526316 0.84210526 0.81578947 0.63157895 0.57894737 0.68421053 0.59459459 0.83783784] mean value: 0.6985064011379801 key: train_accuracy value: [0.97352941 0.97941176 0.82941176 0.98235294 0.94117647 0.92941176 0.92941176 0.88529412 0.93548387 0.97653959] mean value: 0.9362023460410557 key: test_fscore value: [0.75675676 0.65 0.57142857 0.83333333 0.81081081 0.61111111 0.52941176 0.64705882 0.65116279 0.83333333] mean value: 0.6894407295706886 key: train_fscore value: [0.97280967 0.97935103 0.79432624 0.98203593 0.9375 0.92405063 0.92405063 0.87043189 0.93529412 0.97701149] mean value: 0.9296861640810983 key: test_precision value: [0.77777778 0.61904762 0.625 0.88235294 0.83333333 0.64705882 0.6 0.73333333 0.58333333 0.83333333] mean value: 0.7134570494864613 key: train_precision value: [1. 0.98224852 1. 1. 1. 1. 1. 1. 
0.93529412 0.96045198] mean value: 0.9877994615758248 key: test_recall value: [0.73684211 0.68421053 0.52631579 0.78947368 0.78947368 0.57894737 0.47368421 0.57894737 0.73684211 0.83333333] mean value: 0.6728070175438596 key: train_recall value: [0.94705882 0.97647059 0.65882353 0.96470588 0.88235294 0.85882353 0.85882353 0.77058824 0.93529412 0.99415205] mean value: 0.8847093223254214 key: test_roc_auc value: [0.76315789 0.63157895 0.60526316 0.84210526 0.81578947 0.63157895 0.57894737 0.68421053 0.59064327 0.8377193 ] mean value: 0.6980994152046783 key: train_roc_auc value: [0.97352941 0.97941176 0.82941176 0.98235294 0.94117647 0.92941176 0.92941176 0.88529412 0.93548332 0.97648779] mean value: 0.9361971104231166 key: test_jcc value: [0.60869565 0.48148148 0.4 0.71428571 0.68181818 0.44 0.36 0.47826087 0.48275862 0.71428571] mean value: 0.5361586234299878 key: train_jcc value: [0.94705882 0.95953757 0.65882353 0.96470588 0.88235294 0.85882353 0.85882353 0.77058824 0.87845304 0.95505618] mean value: 0.8734223261291885 MCC on Blind test: 0.41 Accuracy on Blind test: 0.71 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.01517224 0.0148356 0.01910758 0.02715397 0.03688359 0.02013373 0.03677201 0.03099537 0.03356433 0.03359127] mean value: 0.026820969581604005 key: score_time value: [0.01220894 0.01221657 0.01221132 0.02378225 0.02123022 0.02143621 0.02273417 0.02497792 0.022928 0.02345252] mean value: 0.019717812538146973 key: test_mcc value: [0.89473684 0.9486833 0.78947368 0.84327404 0.84327404 0.58218174 0.84327404 0.79388419 0.56725146 0.78764146] mean value: 0.7893674798967042 key: train_mcc value: [0.87064849 0.89411765 0.90588235 0.87684993 0.89417953 0.90014017 0.87660709 0.89417953 0.90043693 0.90030617] mean value: 0.8913347841716839 key: test_accuracy value: [0.94736842 0.97368421 0.89473684 0.92105263 0.92105263 0.78947368 0.92105263 0.89473684 0.78378378 0.89189189] mean value: 0.8938833570412518 key: train_accuracy value: [0.93529412 0.94705882 0.95294118 0.93823529 0.94705882 0.95 0.93823529 0.94705882 0.95014663 0.95014663] mean value: 0.9456175608073141 key: test_fscore value: [0.94736842 0.97435897 0.89473684 0.92307692 0.92307692 0.77777778 0.91891892 0.9 0.78947368 0.88235294] mean value: 0.8931141405754409 key: train_fscore value: [0.93567251 0.94705882 0.95294118 0.93913043 0.94736842 0.95043732 0.93877551 0.94736842 0.95043732 0.95043732] mean value: 0.9459627255064605 /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:136: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:139: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: test_precision value: [0.94736842 0.95 0.89473684 0.9 0.9 0.82352941 0.94444444 0.85714286 0.78947368 0.9375 ] mean value: 0.8944195660720429 key: train_precision value: [0.93023256 0.94705882 0.95294118 0.92571429 0.94186047 0.94219653 0.93063584 0.94186047 0.94219653 0.94767442] mean value: 0.9402371094425134 key: test_recall value: [0.94736842 1. 0.89473684 0.94736842 0.94736842 0.73684211 0.89473684 0.94736842 0.78947368 0.83333333] mean value: 0.893859649122807 key: train_recall value: [0.94117647 0.94705882 0.95294118 0.95294118 0.95294118 0.95882353 0.94705882 0.95294118 0.95882353 0.95321637] mean value: 0.9517922256621947 key: test_roc_auc value: [0.94736842 0.97368421 0.89473684 0.92105263 0.92105263 0.78947368 0.92105263 0.89473684 0.78362573 0.89035088] mean value: 0.8937134502923977 key: train_roc_auc value: [0.93529412 0.94705882 0.95294118 0.93823529 0.94705882 0.95 0.93823529 0.94705882 0.950172 0.9501376 ] mean value: 0.9456191950464397 key: test_jcc value: [0.9 0.95 0.80952381 0.85714286 0.85714286 0.63636364 0.85 0.81818182 0.65217391 0.78947368] mean value: 0.8120002575608983 key: train_jcc value: [0.87912088 0.89944134 0.91011236 0.8852459 0.9 0.90555556 0.88461538 0.9 0.90555556 0.90555556] mean value: 0.897520253237496 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', 
GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 
'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.1303339 0.28677583 0.31174827 0.25746155 0.25486231 0.268327 0.14300919 0.26858115 0.38862586 0.40423918] mean value: 0.27139642238616946 key: score_time value: [0.01270962 0.0216949 0.02374864 0.02220631 0.02106261 0.03258085 0.01293516 0.02181482 0.02210927 0.02351737] mean value: 0.021437954902648926 key: test_mcc value: [0.89473684 0.9486833 0.78947368 0.84327404 0.84327404 0.63245553 0.84327404 0.79388419 0.56725146 0.78764146] mean value: 0.7943948594573259 key: train_mcc value: [0.87064849 0.89411765 0.90588235 0.87684993 0.89417953 0.9353103 0.87660709 0.92947609 0.90043693 0.90030617] mean value: 0.898381453059501 key: test_accuracy value: [0.94736842 0.97368421 0.89473684 0.92105263 0.92105263 0.81578947 0.92105263 0.89473684 0.78378378 0.89189189] mean value: 0.8965149359886202 key: train_accuracy value: [0.93529412 0.94705882 0.95294118 0.93823529 0.94705882 0.96764706 0.93823529 0.96470588 0.95014663 0.95014663] mean value: 0.94914697257202 key: test_fscore value: [0.94736842 0.97435897 0.89473684 0.92307692 0.92307692 0.81081081 0.91891892 0.9 0.78947368 0.88235294] mean value: 0.8964174438787442 key: train_fscore value: [0.93567251 0.94705882 0.95294118 0.93913043 0.94736842 0.96755162 0.93877551 0.96449704 0.95043732 0.95043732] mean value: 0.9493870180066716 key: test_precision value: [0.94736842 0.95 0.89473684 0.9 0.9 0.83333333 0.94444444 0.85714286 0.78947368 0.9375 ] mean value: 
0.8953999582289056 key: train_precision value: [0.93023256 0.94705882 0.95294118 0.92571429 0.94186047 0.9704142 0.93063584 0.9702381 0.94219653 0.94767442] mean value: 0.9458966393938475 key: test_recall value: [0.94736842 1. 0.89473684 0.94736842 0.94736842 0.78947368 0.89473684 0.94736842 0.78947368 0.83333333] mean value: 0.8991228070175439 key: train_recall value: [0.94117647 0.94705882 0.95294118 0.95294118 0.95294118 0.96470588 0.94705882 0.95882353 0.95882353 0.95321637] mean value: 0.95296869625043 key: test_roc_auc value: [0.94736842 0.97368421 0.89473684 0.92105263 0.92105263 0.81578947 0.92105263 0.89473684 0.78362573 0.89035088] mean value: 0.896345029239766 key: train_roc_auc value: [0.93529412 0.94705882 0.95294118 0.93823529 0.94705882 0.96764706 0.93823529 0.96470588 0.950172 0.9501376 ] mean value: 0.9491486068111455 key: test_jcc value: [0.9 0.95 0.80952381 0.85714286 0.85714286 0.68181818 0.85 0.81818182 0.65217391 0.78947368] mean value: 0.8165457121063529 key: train_jcc value: [0.87912088 0.89944134 0.91011236 0.8852459 0.9 0.93714286 0.88461538 0.93142857 0.90555556 0.90555556] mean value: 0.9038218405390832 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. 
of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.027807 0.0355041 0.06297231 0.05780268 0.05370164 0.03470182 0.03526163 0.03333521 0.0357399 0.03369379]
mean value: 0.0410520076751709

key: score_time
value: [0.01207423 0.01532435 0.01223469 0.01207042 0.0144453 0.01446915 0.0145359 0.01214051 0.01207852 0.01209593]
mean value: 0.01314690113067627

key: test_mcc
value: [0.89473684 0.9486833 0.73786479 0.84327404 0.73786479 0.48454371 0.89473684 0.79388419 0.62807634 0.78764146]
mean value: 0.77513062978033

key: train_mcc
value: [0.85295593 0.86472084 0.87660709 0.85888297 0.85882353 0.86472084 0.87058824 0.87064849 0.86511868 0.86511404]
mean value: 0.8648180657385829

key: test_accuracy
value: [0.94736842 0.97368421 0.86842105 0.92105263 0.86842105 0.73684211 0.94736842 0.89473684 0.81081081 0.89189189]
mean value: 0.8860597439544808

key: train_accuracy
value: [0.92647059 0.93235294 0.93823529 0.92941176 0.92941176 0.93235294 0.93529412 0.93529412 0.93255132 0.93255132]
mean value: 0.9323926168707952

key: test_fscore
value: [0.94736842 0.97435897 0.86486486 0.92307692 0.86486486 0.70588235 0.94736842 0.9 0.82926829 0.88235294]
mean value: 0.8839406056071464

key: train_fscore
value: [0.92668622 0.93255132 0.93877551 0.92982456 0.92941176 0.93255132 0.93529412 0.93491124 0.93255132 0.93294461]
mean value: 0.9325501978931156

key: test_precision
value: [0.94736842 0.95 0.88888889 0.9 0.88888889 0.8 0.94736842 0.85714286 0.77272727 0.9375 ]
mean value: 0.8889884749753171

key: train_precision
value: [0.92397661 0.92982456 0.93063584 0.9244186 0.92941176 0.92982456 0.93529412 0.94047619 0.92982456 0.93023256]
mean value: 0.9303919366167779

key: test_recall
value: [0.94736842 1. 0.84210526 0.94736842 0.84210526 0.63157895 0.94736842 0.94736842 0.89473684 0.83333333]
mean value: 0.8833333333333333

key: train_recall
value: [0.92941176 0.93529412 0.94705882 0.93529412 0.92941176 0.93529412 0.93529412 0.92941176 0.93529412 0.93567251]
mean value: 0.9347437220502236

key: test_roc_auc
value: [0.94736842 0.97368421 0.86842105 0.92105263 0.86842105 0.73684211 0.94736842 0.89473684 0.80847953 0.89035088]
mean value: 0.8856725146198831

key: train_roc_auc
value: [0.92647059 0.93235294 0.93823529 0.92941176 0.92941176 0.93235294 0.93529412 0.93529412 0.93255934 0.93254214]
mean value: 0.9323925008599931

key: test_jcc
value: [0.9 0.95 0.76190476 0.85714286 0.76190476 0.54545455 0.9 0.81818182 0.70833333 0.78947368]
mean value: 0.7992395762132605

key: train_jcc
value: [0.86338798 0.87362637 0.88461538 0.86885246 0.86813187 0.87362637 0.87845304 0.87777778 0.87362637 0.87431694]
mean value: 0.8736414567127365

MCC on Blind test: 0.83
Accuracy on Blind test: 0.91

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP:
TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [1.12819099 0.96782017 0.88282394 1.06483197 0.87350464 1.21616435 1.23634672 1.01481581 1.04923725 1.17672253] mean value: 1.0610458374023437 key: score_time value: [0.01469541 0.01223922 0.01531339 0.01517987 0.01223707 0.01528811 0.01213527 0.01546454 0.0202651 0.015342 ] mean value: 0.014815998077392579 key: test_mcc value: [0.89973541 0.89973541 0.73786479 0.9486833 0.84327404 0.53300179 0.84327404 0.79388419 0.51461988 0.78362573] mean value: 0.7797698583492705 key: train_mcc value: [0.97653817 0.89417953 1. 0.976741 0.88235294 0.90588235 0.98236994 0.98250594 0.99415185 0.98833809] mean value: 0.9583059813818373 key: test_accuracy value: [0.94736842 0.94736842 0.86842105 0.97368421 0.92105263 0.76315789 0.92105263 0.89473684 0.75675676 0.89189189] mean value: 0.8885490753911807 key: train_accuracy value: [0.98823529 0.94705882 1. 
0.98823529 0.94117647 0.95294118 0.99117647 0.99117647 0.99706745 0.9941349 ] mean value: 0.9791202346041056 key: test_fscore value: [0.95 0.95 0.87179487 0.97297297 0.92307692 0.74285714 0.92307692 0.9 0.75675676 0.88888889] mean value: 0.887942447942448 key: train_fscore value: [0.98816568 0.94674556 1. 0.98809524 0.94117647 0.95294118 0.99115044 0.99109792 0.99705015 0.99411765] mean value: 0.9790540287635601 key: test_precision value: [0.9047619 0.9047619 0.85 1. 0.9 0.8125 0.9 0.85714286 0.77777778 0.88888889] mean value: 0.8795833333333334 key: train_precision value: [0.99404762 0.95238095 1. 1. 0.94117647 0.95294118 0.99408284 1. 1. 1. ] mean value: 0.9834629058724081 key: test_recall value: [1. 1. 0.89473684 0.94736842 0.94736842 0.68421053 0.94736842 0.94736842 0.73684211 0.88888889] mean value: 0.8994152046783626 key: train_recall value: [0.98235294 0.94117647 1. 0.97647059 0.94117647 0.95294118 0.98823529 0.98235294 0.99411765 0.98830409] mean value: 0.9747127622979016 key: test_roc_auc value: [0.94736842 0.94736842 0.86842105 0.97368421 0.92105263 0.76315789 0.92105263 0.89473684 0.75730994 0.89181287] mean value: 0.8885964912280702 key: train_roc_auc value: [0.98823529 0.94705882 1. 0.98823529 0.94117647 0.95294118 0.99117647 0.99117647 0.99705882 0.99415205] mean value: 0.9791210870313037 key: test_jcc value: [0.9047619 0.9047619 0.77272727 0.94736842 0.85714286 0.59090909 0.85714286 0.81818182 0.60869565 0.8 ] mean value: 0.806169177885425 key: train_jcc value: [0.97660819 0.8988764 1. 
0.97647059 0.88888889 0.91011236 0.98245614 0.98235294 0.99411765 0.98830409] mean value: 0.9598187250457052 MCC on Blind test: 0.76 Accuracy on Blind test: 0.88 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01373482 0.01149917 0.01022553 0.00962138 0.00941873 0.01020408 0.00948644 0.00971484 0.0105443 0.00974584] mean value: 0.010419511795043945 key: score_time value: [0.01228786 0.00936007 0.00903177 0.00898194 0.00892377 0.00910449 0.00882721 0.0090754 0.00997901 0.00893044] mean value: 0.009450197219848633 key: test_mcc value: [0.68803296 0.69989647 0.69989647 0.79388419 0.57894737 0.59222009 0.47633051 0.73786479 0.57184997 0.64287856] mean value: 0.6481801382126267 key: train_mcc value: [0.64994387 0.6215412 0.63334622 0.65705784 0.64172131 0.6871247 0.65360504 0.66133552 0.66975134 0.65909576] mean value: 0.6534522792161372 key: test_accuracy value: [0.84210526 0.84210526 0.84210526 0.89473684 0.78947368 0.78947368 0.73684211 0.86842105 0.78378378 0.81081081] mean value: 0.8199857752489331 key: train_accuracy value: [0.82352941 0.80882353 0.80588235 0.82647059 0.81764706 0.84117647 0.82352941 0.82647059 0.83284457 0.82697947] mean value: 0.8233353458685527 key: test_fscore value: [0.83333333 
0.82352941 0.82352941 0.88888889 0.78947368 0.76470588 0.72222222 0.87179487 0.77777778 0.77419355] mean value: 0.806944903249707 key: train_fscore value: [0.81481481 0.79750779 0.77702703 0.81619938 0.80379747 0.83125 0.81012658 0.8115016 0.82242991 0.81619938] mean value: 0.8100853938516973 key: test_precision value: [0.88235294 0.93333333 0.93333333 0.94117647 0.78947368 0.86666667 0.76470588 0.85 0.82352941 0.92307692] mean value: 0.8707648646503136 key: train_precision value: [0.85714286 0.84768212 0.91269841 0.86754967 0.86986301 0.88666667 0.87671233 0.88811189 0.87417219 0.87333333] mean value: 0.8753932473928845 key: test_recall value: [0.78947368 0.73684211 0.73684211 0.84210526 0.78947368 0.68421053 0.68421053 0.89473684 0.73684211 0.66666667] mean value: 0.756140350877193 key: train_recall value: [0.77647059 0.75294118 0.67647059 0.77058824 0.74705882 0.78235294 0.75294118 0.74705882 0.77647059 0.76608187] mean value: 0.75484348125215 key: test_roc_auc value: [0.84210526 0.84210526 0.84210526 0.89473684 0.78947368 0.78947368 0.73684211 0.86842105 0.78508772 0.80701754] mean value: 0.8197368421052632 key: train_roc_auc value: [0.82352941 0.80882353 0.80588235 0.82647059 0.81764706 0.84117647 0.82352941 0.82647059 0.83267974 0.82715858] mean value: 0.8233367733058135 key: test_jcc value: [0.71428571 0.7 0.7 0.8 0.65217391 0.61904762 0.56521739 0.77272727 0.63636364 0.63157895] mean value: 0.6791394494140489 key: train_jcc value: [0.6875 0.66321244 0.63535912 0.68947368 0.67195767 0.71122995 0.68085106 0.6827957 0.6984127 0.68947368] mean value: 0.6810265999325266 MCC on Blind test: 0.68 Accuracy on Blind test: 0.84 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', 
MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 
'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01158476 0.01068711 0.0107584 0.00968242 0.01254201 0.01016307 0.01010203 0.00989175 0.00978208 0.01023674] mean value: 0.010543036460876464 key: score_time value: [0.01002502 0.00914407 0.00887299 0.00956154 0.01029301 0.0097959 0.00908971 0.0090239 0.00896478 0.00972867] mean value: 0.009449958801269531 key: test_mcc value: [0.47368421 0.68421053 0.68421053 0.79388419 0.78947368 0.47368421 0.78947368 0.63960215 0.48078072 0.62807634] mean value: 0.6437080232004303 key: train_mcc value: [0.73561236 0.70588235 0.75314969 0.72986649 0.70593121 0.73561236 0.71769673 0.75314969 0.73607623 0.71966354] mean value: 0.7292640648226028 key: test_accuracy value: [0.73684211 0.84210526 0.84210526 0.89473684 0.89473684 0.73684211 0.89473684 0.81578947 0.72972973 0.81081081] mean value: 0.8198435277382645 key: train_accuracy value: [0.86764706 0.85294118 0.87647059 0.86470588 0.85294118 0.86764706 0.85882353 0.87647059 0.86803519 0.85923754] mean value: 0.8644919786096257 key: test_fscore value: [0.73684211 0.84210526 0.84210526 0.9 0.89473684 0.73684211 0.89473684 0.82926829 0.77272727 0.78787879] mean value: 0.8237242774341619 key: train_fscore value: [0.86956522 0.85294118 0.87790698 0.86705202 0.85380117 0.86956522 0.85964912 0.87790698 0.86725664 0.86363636] mean value: 0.8659280881065122 key: test_precision value: [0.73684211 0.84210526 0.84210526 0.85714286 0.89473684 0.73684211 0.89473684 0.77272727 0.68 0.86666667] mean value: 0.8123905217589428 key: train_precision value: [0.85714286 0.85294118 0.86781609 0.85227273 0.84883721 
0.85714286 0.85465116 0.86781609 0.86982249 0.83977901] mean value: 0.8568221664762061 key: test_recall value: [0.73684211 0.84210526 0.84210526 0.94736842 0.89473684 0.73684211 0.89473684 0.89473684 0.89473684 0.72222222] mean value: 0.8406432748538012 key: train_recall value: [0.88235294 0.85294118 0.88823529 0.88235294 0.85882353 0.88235294 0.86470588 0.88823529 0.86470588 0.88888889] mean value: 0.8753594771241829 key: test_roc_auc value: [0.73684211 0.84210526 0.84210526 0.89473684 0.89473684 0.73684211 0.89473684 0.81578947 0.7251462 0.80847953] mean value: 0.8191520467836257 key: train_roc_auc value: [0.86764706 0.85294118 0.87647059 0.86470588 0.85294118 0.86764706 0.85882353 0.87647059 0.86802546 0.85915033] mean value: 0.8644822841417269 key: test_jcc value: [0.58333333 0.72727273 0.72727273 0.81818182 0.80952381 0.58333333 0.80952381 0.70833333 0.62962963 0.65 ] mean value: 0.7046404521404521 key: train_jcc value: [0.76923077 0.74358974 0.78238342 0.76530612 0.74489796 0.76923077 0.75384615 0.78238342 0.765625 0.76 ] mean value: 0.7636493356908327 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00920463 0.01011729 0.01018023 0.01018381 0.0103085 0.01037335 0.01010799 0.01031637 0.01011348 0.01027107] mean value: 0.010117673873901367 key: score_time value: [0.01739001 0.01198483 0.01202416 0.01202321 0.0119853 0.01478839 0.01201344 0.01181197 0.01210189 0.01525283] mean value: 0.013137602806091308 key: test_mcc value: [0.52704628 0.57894737 0.31622777 0.52704628 0.43643578 0.36842105 0.37047929 0.21821789 0.40780312 0.75614764] mean value: 0.450677246185986 key: train_mcc value: [0.63533809 0.67063465 0.67657595 0.63547005 0.71207276 0.68853317 0.64710361 0.64723801 0.72491598 0.58359133] mean value: 0.6621473590091169 key: test_accuracy value: [0.76315789 0.78947368 0.65789474 0.76315789 0.71052632 0.68421053 0.68421053 0.60526316 0.7027027 0.86486486] mean value: 0.7225462304409673 key: train_accuracy value: [0.81764706 0.83529412 0.83823529 0.81764706 0.85588235 0.84411765 0.82352941 0.82352941 0.86217009 0.79178886] mean value: 0.8309841297222701 key: test_fscore value: [0.75675676 0.78947368 0.64864865 0.75675676 0.74418605 0.68421053 0.66666667 0.54545455 0.73170732 0.83870968] mean value: 0.7162570625813843 key: train_fscore value: [0.81871345 0.83625731 0.83965015 0.81547619 0.85373134 0.84637681 0.8245614 0.8255814 0.85885886 0.79178886] mean value: 0.8310995765381942 key: test_precision value: [0.77777778 0.78947368 0.66666667 0.77777778 0.66666667 0.68421053 0.70588235 0.64285714 0.68181818 1. 
] mean value: 0.7393130777031706 key: train_precision value: [0.81395349 0.83139535 0.83236994 0.8253012 0.86666667 0.83428571 0.81976744 0.81609195 0.87730061 0.79411765] mean value: 0.8311250021616702 key: test_recall value: [0.73684211 0.78947368 0.63157895 0.73684211 0.84210526 0.68421053 0.63157895 0.47368421 0.78947368 0.72222222] mean value: 0.7038011695906432 key: train_recall value: [0.82352941 0.84117647 0.84705882 0.80588235 0.84117647 0.85882353 0.82941176 0.83529412 0.84117647 0.78947368] mean value: 0.8313003095975232 key: test_roc_auc value: [0.76315789 0.78947368 0.65789474 0.76315789 0.71052632 0.68421053 0.68421053 0.60526316 0.7002924 0.86111111] mean value: 0.7219298245614035 key: train_roc_auc value: [0.81764706 0.83529412 0.83823529 0.81764706 0.85588235 0.84411765 0.82352941 0.82352941 0.8621087 0.79179567] mean value: 0.8309786721706227 key: test_jcc value: [0.60869565 0.65217391 0.48 0.60869565 0.59259259 0.52 0.5 0.375 0.57692308 0.72222222] mean value: 0.5636303109129196 key: train_jcc value: [0.69306931 0.71859296 0.72361809 0.68844221 0.74479167 0.73366834 0.70149254 0.7029703 0.75263158 0.65533981] mean value: 0.7114616800753307 MCC on Blind test: 0.5 Accuracy on Blind test: 0.75 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, 
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01786804 0.01848149 0.01594734 0.01577377 0.0179441 0.01872945 0.01715374 0.01642799 0.01686239 0.01881289] mean value: 0.017400121688842772 key: score_time value: [0.01160026 0.01157236 0.01054382 0.01053405 0.01060677 0.01129889 0.01175284 0.01054502 0.01147771 0.01128459] mean value: 0.011121630668640137 key: test_mcc value: [0.84327404 0.9486833 0.78947368 0.78947368 0.78947368 0.48454371 0.89473684 0.73786479 0.51793973 0.78764146] mean value: 0.7583104925854314 key: train_mcc value: [0.80005537 0.78823529 0.79413139 0.81766121 0.80005537 0.8058963 0.78236648 0.80005537 0.82404541 0.79483211] mean value: 0.8007334285346163 key: test_accuracy value: [0.92105263 0.97368421 0.89473684 0.89473684 0.89473684 0.73684211 0.94736842 0.86842105 0.75675676 0.89189189] mean value: 0.878022759601707 key: train_accuracy value: [0.9 0.89411765 0.89705882 0.90882353 0.9 0.90294118 0.89117647 0.9 0.91202346 0.8973607 ] mean value: 0.9003501811281698 key: test_fscore value: [0.91891892 0.97435897 0.89473684 0.89473684 0.89473684 0.70588235 0.94736842 0.87179487 0.7804878 0.88235294] mean value: 0.8765374811436882 key: train_fscore value: [0.9005848 0.89411765 0.8973607 0.90909091 0.9005848 0.90322581 0.89085546 0.89940828 0.91176471 0.89855072] mean value: 0.9005543828827779 key: test_precision value: [0.94444444 0.95 0.89473684 0.89473684 0.89473684 0.8 0.94736842 0.85 0.72727273 0.9375 ] mean value: 0.8840796119085592 key: train_precision value: [0.89534884 0.89411765 0.89473684 0.90643275 0.89534884 0.9005848 0.89349112 0.9047619 0.91176471 0.8908046 ] mean value: 
0.8987392040048102 key: test_recall value: [0.89473684 1. 0.89473684 0.89473684 0.89473684 0.63157895 0.94736842 0.89473684 0.84210526 0.83333333] mean value: 0.8728070175438596 key: train_recall value: [0.90588235 0.89411765 0.9 0.91176471 0.90588235 0.90588235 0.88823529 0.89411765 0.91176471 0.90643275] mean value: 0.9024079807361541 key: test_roc_auc value: [0.92105263 0.97368421 0.89473684 0.89473684 0.89473684 0.73684211 0.94736842 0.86842105 0.75438596 0.89035088] mean value: 0.8776315789473684 key: train_roc_auc value: [0.9 0.89411765 0.89705882 0.90882353 0.9 0.90294118 0.89117647 0.9 0.9120227 0.89733402] mean value: 0.9003474372205023 key: test_jcc value: [0.85 0.95 0.80952381 0.80952381 0.80952381 0.54545455 0.9 0.77272727 0.64 0.78947368] mean value: 0.7876226930963773 key: train_jcc value: [0.81914894 0.80851064 0.81382979 0.83333333 0.81914894 0.82352941 0.80319149 0.8172043 0.83783784 0.81578947] mean value: 0.8191524144929399 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.49616385 1.30514312 1.44180202 1.37484336 1.37929106 1.37126446 1.30904698 1.39595342 1.31129622 1.29941964] mean value: 1.3684224128723144 key: score_time value: [0.01476836 0.01247644 0.01286674 0.02777433 0.01502228 0.01509333 0.01474261 0.01487732 0.0184629 0.01495028] mean value: 0.016103458404541016 key: test_mcc value: [0.89473684 0.89973541 0.68421053 0.89973541 0.85280287 0.58218174 0.89473684 0.79388419 0.51319869 0.94721815] mean value: 0.7962440656984944 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.94736842 0.84210526 0.94736842 0.92105263 0.78947368 0.94736842 0.89473684 0.75675676 0.97297297] mean value: 0.8966571834992887 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 0.95 0.84210526 0.94444444 0.92682927 0.77777778 0.94736842 0.9 0.76923077 0.97142857] mean value: 0.8976552936437403 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94736842 0.9047619 0.84210526 1. 0.86363636 0.82352941 0.94736842 0.85714286 0.75 1. ] mean value: 0.8935912642568989 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.94736842 1. 0.84210526 0.89473684 1. 0.73684211 0.94736842 0.94736842 0.78947368 0.94444444] mean value: 0.9049707602339181 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.94736842 0.84210526 0.94736842 0.92105263 0.78947368 0.94736842 0.89473684 0.75584795 0.97222222] mean value: 0.8964912280701754 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.9047619 0.72727273 0.89473684 0.86363636 0.63636364 0.9 0.81818182 0.625 0.94444444] mean value: 0.8214397736766158 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, 
num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02944374 0.0182209 0.01535916 0.01744986 0.01732898 0.01718378 0.01695347 0.01524591 0.01629186 0.01514053] mean value: 0.017861819267272948 key: score_time value: [0.01158214 0.00925207 0.00901937 0.0089376 0.00885868 0.00885081 0.00914764 0.00883889 0.00884986 0.00873947] mean value: 0.009207653999328613 key: test_mcc value: [0.9486833 0.9486833 0.84327404 0.9486833 0.9486833 0.89973541 0.85280287 0.84327404 0.73099415 0.94736842] mean value: 0.8912182126989485 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.97368421 0.92105263 0.97368421 0.97368421 0.94736842 0.92105263 0.92105263 0.86486486 0.97297297] mean value: 0.9443100995732575 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.97297297 0.92307692 0.97435897 0.97435897 0.94444444 0.92682927 0.91891892 0.86486486 0.97297297] mean value: 0.9447157288620703 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 1. 0.9 0.95 0.95 1. 0.86363636 0.94444444 0.88888889 0.94736842] mean value: 0.9394338118022328 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94736842 0.94736842 1. 1. 0.89473684 1. 0.89473684 0.84210526 1. ] mean value: 0.9526315789473684 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.97368421 0.97368421 0.92105263 0.97368421 0.97368421 0.94736842 0.92105263 0.92105263 0.86549708 0.97368421] mean value: 0.9444444444444444 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.94736842 0.85714286 0.95 0.95 0.89473684 0.86363636 0.85 0.76190476 0.94736842] mean value: 0.8972157666894509 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), 
('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.11228299 0.11245561 0.10948372 0.11234665 0.10924101 0.1089797 0.11292195 0.11369872 0.11021519 0.10727549] mean value: 0.1108901023864746 key: score_time value: [0.01883864 0.01856804 0.01872277 0.01774263 0.01919746 0.01752329 0.01925826 0.01753545 0.01749849 0.0175736 ] mean value: 0.018245863914489745 key: test_mcc value: [0.9486833 0.89473684 0.63960215 0.84327404 0.78947368 0.73786479 0.89473684 0.79388419 0.62170355 1. ] mean value: 0.8163959384712077 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.81578947 0.92105263 0.89473684 0.86842105 0.94736842 0.89473684 0.81081081 1. ] mean value: 0.9073968705547653 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97435897 0.94736842 0.8 0.92307692 0.89473684 0.87179487 0.94736842 0.9 0.82051282 1. ] mean value: 0.9079217273954115 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95 0.94736842 0.875 0.9 0.89473684 0.85 0.94736842 0.85714286 0.8 1. ] mean value: 0.9021616541353383 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94736842 0.73684211 0.94736842 0.89473684 0.89473684 0.94736842 0.94736842 0.84210526 1. ] mean value: 0.9157894736842105 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.81578947 0.92105263 0.89473684 0.86842105 0.94736842 0.89473684 0.80994152 1. ] mean value: 0.9073099415204678 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.95 0.9 0.66666667 0.85714286 0.80952381 0.77272727 0.9 0.81818182 0.69565217 1. ] mean value: 0.8369894598155467 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01023006 0.0107708 0.00978875 0.01090908 0.00982141 0.00973773 0.01041436 0.009727 0.01096225 0.0110054 ] mean value: 0.010336685180664062 key: score_time value: [0.00909853 0.00951576 0.00937533 0.00973797 0.00959563 0.0088625 0.00879812 0.0088625 0.00882602 0.00961709] mean value: 0.009228944778442383 key: test_mcc value: [0.57894737 0.63245553 0.21821789 0.47368421 0.58218174 0.68803296 0.47633051 0.42640143 0.19005848 0.62170355] mean value: 0.48880136757948445 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.78947368 0.81578947 0.60526316 0.73684211 0.78947368 0.84210526 0.73684211 0.71052632 0.59459459 0.81081081] mean value: 0.743172119487909 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.78947368 0.81081081 0.54545455 0.73684211 0.8 0.85 0.72222222 0.73170732 0.59459459 0.8 ] mean value: 0.7381105279629028 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.78947368 0.83333333 0.64285714 0.73684211 0.76190476 0.80952381 0.76470588 0.68181818 0.61111111 0.82352941] mean value: 0.7455099424139672 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.78947368 0.78947368 0.47368421 0.73684211 0.84210526 0.89473684 0.68421053 0.78947368 0.57894737 0.77777778] mean value: 0.735672514619883 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.78947368 0.81578947 0.60526316 0.73684211 0.78947368 0.84210526 0.73684211 0.71052632 0.59502924 0.80994152] mean value: 0.7431286549707602 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.65217391 0.68181818 0.375 0.58333333 0.66666667 0.73913043 0.56521739 0.57692308 0.42307692 0.66666667] mean value: 0.5930006587615283 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.52 Accuracy on Blind test: 0.76 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. 
warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.55051184 1.54603195 1.53799772 1.54854393 1.5016973 1.51683092 1.5347023 1.53684664 1.5649302 1.55764413] mean value: 1.5395736932754516 key: score_time value: [0.09356093 0.09390473 0.09295797 0.09251022 0.09202743 0.09703946 0.09734011 0.09730768 0.09886241 0.09727573] mean value: 0.09527866840362549 key: test_mcc value: [1. 1. 0.78947368 0.89473684 0.89973541 0.89973541 1. 0.9486833 0.78362573 0.94736842] mean value: 0.9163358798097961 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 0.89473684 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.89189189 0.97297297] mean value: 0.9575391180654338 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 0.89473684 0.94736842 0.95 0.94444444 1. 0.97435897 0.89473684 0.97297297] mean value: 0.9578618497039549 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.89473684 0.94736842 0.9047619 1. 1. 0.95 0.89473684 0.94736842] mean value: 0.9538972431077695 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.89473684 0.94736842 1. 0.89473684 1. 1. 0.89473684 1. ] mean value: 0.9631578947368421 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 0.89473684 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.89181287 0.97368421] mean value: 0.9576023391812866 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [1. 1. 0.80952381 0.9 0.9047619 0.89473684 1. 0.95 0.80952381 0.94736842] mean value: 0.9215914786967419 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.89 Accuracy on Blind test: 0.95 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, 
random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.91693258 0.89162326 0.93540263 0.97587872 0.90822935 0.87675786 0.95299554 0.92298007 1.00643253 0.8825717 ] mean value: 0.9269804239273072 key: score_time value: [0.22907305 0.24806905 0.17145753 0.21838999 0.27262688 0.16964602 0.24580407 0.27468348 0.24008393 0.2604115 ] mean value: 0.23302454948425294 key: test_mcc value: [1. 0.89973541 0.68803296 0.89473684 0.84327404 0.85280287 0.9486833 0.89973541 0.67849265 0.94736842] mean value: 0.8652861897439335 key: train_mcc value: [0.95884012 0.95884012 0.95884012 0.95294118 0.95300713 0.95884012 0.95884012 0.96477265 0.97653939 0.95896113] mean value: 0.9600422066629911 key: test_accuracy value: [1. 
0.94736842 0.84210526 0.94736842 0.92105263 0.92105263 0.97368421 0.94736842 0.83783784 0.97297297] mean value: 0.931081081081081 key: train_accuracy value: [0.97941176 0.97941176 0.97941176 0.97647059 0.97647059 0.97941176 0.97941176 0.98235294 0.98826979 0.97947214] mean value: 0.9800094876660341 key: test_fscore value: [1. 0.94444444 0.83333333 0.94736842 0.92307692 0.91428571 0.97297297 0.95 0.85 0.97297297] mean value: 0.9308454782138993 key: train_fscore value: [0.97935103 0.97935103 0.97935103 0.97647059 0.97633136 0.97947214 0.97935103 0.98224852 0.98823529 0.97947214] mean value: 0.9799634175328182 key: test_precision value: [1. 1. 0.88235294 0.94736842 0.9 1. 1. 0.9047619 0.80952381 0.94736842] mean value: 0.9391375497567448 key: train_precision value: [0.98224852 0.98224852 0.98224852 0.97647059 0.98214286 0.97660819 0.98224852 0.98809524 0.98823529 0.98235294] mean value: 0.9822899188742247 key: test_recall value: [1. 0.89473684 0.78947368 0.94736842 0.94736842 0.84210526 0.94736842 1. 0.89473684 1. ] mean value: 0.9263157894736842 key: train_recall value: [0.97647059 0.97647059 0.97647059 0.97647059 0.97058824 0.98235294 0.97647059 0.97647059 0.98823529 0.97660819] mean value: 0.9776608187134502 key: test_roc_auc value: [1. 0.94736842 0.84210526 0.94736842 0.92105263 0.92105263 0.97368421 0.94736842 0.83625731 0.97368421] mean value: 0.9309941520467836 key: train_roc_auc value: [0.97941176 0.97941176 0.97941176 0.97647059 0.97647059 0.97941176 0.97941176 0.98235294 0.98826969 0.97948056] mean value: 0.9800103199174407 key: test_jcc value: [1. 
0.89473684 0.71428571 0.9 0.85714286 0.84210526 0.94736842 0.9047619 0.73913043 0.94736842] mean value: 0.8746899858341506 key: train_jcc value: [0.95953757 0.95953757 0.95953757 0.95402299 0.95375723 0.95977011 0.95953757 0.96511628 0.97674419 0.95977011] mean value: 0.960733119795795 MCC on Blind test: 0.86 Accuracy on Blind test: 0.93 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost 
Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.010741 0.00969005 0.01051641 0.00985312 0.00964284 0.010144 0.00974083 0.0098412 0.00984383 0.01043248] mean value: 0.010044574737548828 key: score_time value: [0.0090158 0.00899315 0.00946307 0.00896311 0.0089345 0.01059484 0.00890207 0.00879717 0.00891232 0.00898099] mean value: 0.009155702590942384 key: test_mcc value: [0.47368421 0.68421053 0.68421053 0.79388419 0.78947368 0.47368421 0.78947368 0.63960215 0.48078072 0.62807634] mean value: 0.6437080232004303 key: train_mcc value: [0.73561236 0.70588235 0.75314969 0.72986649 0.70593121 0.73561236 0.71769673 0.75314969 0.73607623 0.71966354] mean value: 0.7292640648226028 key: test_accuracy value: [0.73684211 0.84210526 0.84210526 0.89473684 0.89473684 0.73684211 0.89473684 0.81578947 0.72972973 0.81081081] mean value: 0.8198435277382645 key: train_accuracy value: [0.86764706 0.85294118 
0.87647059 0.86470588 0.85294118 0.86764706 0.85882353 0.87647059 0.86803519 0.85923754] mean value: 0.8644919786096257 key: test_fscore value: [0.73684211 0.84210526 0.84210526 0.9 0.89473684 0.73684211 0.89473684 0.82926829 0.77272727 0.78787879] mean value: 0.8237242774341619 key: train_fscore value: [0.86956522 0.85294118 0.87790698 0.86705202 0.85380117 0.86956522 0.85964912 0.87790698 0.86725664 0.86363636] mean value: 0.8659280881065122 key: test_precision value: [0.73684211 0.84210526 0.84210526 0.85714286 0.89473684 0.73684211 0.89473684 0.77272727 0.68 0.86666667] mean value: 0.8123905217589428 key: train_precision value: [0.85714286 0.85294118 0.86781609 0.85227273 0.84883721 0.85714286 0.85465116 0.86781609 0.86982249 0.83977901] mean value: 0.8568221664762061 key: test_recall value: [0.73684211 0.84210526 0.84210526 0.94736842 0.89473684 0.73684211 0.89473684 0.89473684 0.89473684 0.72222222] mean value: 0.8406432748538012 key: train_recall value: [0.88235294 0.85294118 0.88823529 0.88235294 0.85882353 0.88235294 0.86470588 0.88823529 0.86470588 0.88888889] mean value: 0.8753594771241829 key: test_roc_auc value: [0.73684211 0.84210526 0.84210526 0.89473684 0.89473684 0.73684211 0.89473684 0.81578947 0.7251462 0.80847953] mean value: 0.8191520467836257 key: train_roc_auc value: [0.86764706 0.85294118 0.87647059 0.86470588 0.85294118 0.86764706 0.85882353 0.87647059 0.86802546 0.85915033] mean value: 0.8644822841417269 key: test_jcc value: [0.58333333 0.72727273 0.72727273 0.81818182 0.80952381 0.58333333 0.80952381 0.70833333 0.62962963 0.65 ] mean value: 0.7046404521404521 key: train_jcc value: [0.76923077 0.74358974 0.78238342 0.76530612 0.74489796 0.76923077 0.75384615 0.78238342 0.765625 0.76 ] mean value: 0.7636493356908327 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, 
gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', 
SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.10662818 0.08921003 0.0561254 0.17810702 0.05234981 0.05486059 0.06721592 0.05890155 0.26667285 0.05537868] mean value: 0.0985450029373169 key: score_time value: [0.01744914 0.01164341 0.01065397 0.01203752 0.01097751 0.01060772 0.01059103 0.01130843 0.01121664 0.01066351] mean value: 0.011714887619018555 key: test_mcc value: [1. 1. 1. 0.9486833 0.9486833 0.89973541 0.84327404 0.89473684 0.78362573 0.94736842] mean value: 0.9266107043807079 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 1. 0.97368421 0.97368421 0.94736842 0.92105263 0.94736842 0.89189189 0.97297297] mean value: 0.9628022759601707 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 
1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 1. 0.97435897 0.97435897 0.94444444 0.92307692 0.94736842 0.89473684 0.97297297] mean value: 0.9631317552370184 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 0.95 0.95 1. 0.9 0.94736842 0.89473684 0.94736842] mean value: 0.9589473684210527 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 1. 0.89473684 0.94736842 0.94736842 0.89473684 1. ] mean value: 0.968421052631579 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 1. 0.97368421 0.97368421 0.94736842 0.92105263 0.94736842 0.89181287 0.97368421] mean value: 0.9628654970760234 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 1. 1. 0.95 0.95 0.89473684 0.85714286 0.9 0.80952381 0.94736842] mean value: 0.9308771929824561 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04259491 0.05594087 0.03498626 0.03416085 0.03477049 0.06916142 0.06280589 0.03609562 0.05616331 0.07135487] mean value: 0.04980344772338867 key: score_time value: [0.02427268 0.01239538 0.01218939 0.01224589 0.01225781 0.02066946 0.01230192 0.01249409 0.02214265 0.01604795] mean value: 0.015701723098754884 key: test_mcc value: [0.74620251 1. 0.73786479 0.84327404 0.78947368 0.38829014 0.68421053 0.80757285 0.51461988 0.83918129] mean value: 0.7350689707890694 key: train_mcc value: [0.92354539 0.92966915 0.95884012 0.92947609 0.95294118 0.95300713 0.94124161 0.9353103 0.95314596 0.94762566] mean value: 0.9424802586910812 key: test_accuracy value: [0.86842105 1. 0.86842105 0.92105263 0.89473684 0.68421053 0.84210526 0.89473684 0.75675676 0.91891892] mean value: 0.8649359886201992 key: train_accuracy value: [0.96176471 0.96470588 0.97941176 0.96470588 0.97647059 0.97647059 0.97058824 0.96764706 0.97653959 0.97360704] mean value: 0.9711911333448335 key: test_fscore value: [0.87804878 1. 
0.86486486 0.92307692 0.89473684 0.625 0.84210526 0.9047619 0.75675676 0.91891892] mean value: 0.8608270254130331 key: train_fscore value: [0.96165192 0.96428571 0.97935103 0.96449704 0.97647059 0.97660819 0.9704142 0.96755162 0.97660819 0.97329377] mean value: 0.9710732260210945 key: test_precision value: [0.81818182 1. 0.88888889 0.9 0.89473684 0.76923077 0.84210526 0.82608696 0.77777778 0.89473684] mean value: 0.8611745157969415 key: train_precision value: [0.96449704 0.97590361 0.98224852 0.9702381 0.97647059 0.97093023 0.97619048 0.9704142 0.97093023 0.98795181] mean value: 0.9745774809780501 key: test_recall value: [0.94736842 1. 0.84210526 0.94736842 0.89473684 0.52631579 0.84210526 1. 0.73684211 0.94444444] mean value: 0.8681286549707602 key: train_recall value: [0.95882353 0.95294118 0.97647059 0.95882353 0.97647059 0.98235294 0.96470588 0.96470588 0.98235294 0.95906433] mean value: 0.9676711386308909 key: test_roc_auc value: [0.86842105 1. 0.86842105 0.92105263 0.89473684 0.68421053 0.84210526 0.89473684 0.75730994 0.91959064] mean value: 0.8650584795321637 key: train_roc_auc value: [0.96176471 0.96470588 0.97941176 0.96470588 0.97647059 0.97647059 0.97058824 0.96764706 0.97655659 0.97364981] mean value: 0.9711971104231166 key: test_jcc value: [0.7826087 1. 
0.76190476 0.85714286 0.80952381 0.45454545 0.72727273 0.82608696 0.60869565 0.85 ] mean value: 0.7677780914737437 key: train_jcc value: [0.92613636 0.93103448 0.95953757 0.93142857 0.95402299 0.95428571 0.94252874 0.93714286 0.95428571 0.94797688] mean value: 0.9438379878542824 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', 
AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.0131712 0.01293206 0.00980425 0.01048255 0.01042271 0.01054835 0.01047897 0.01062369 0.0106225 0.01053381] mean value: 0.01096200942993164 key: score_time value: [0.01183963 0.00915504 0.00976515 0.00946379 0.00945783 0.00950193 0.00948477 0.00947523 0.00913143 0.00954247] mean value: 0.00968172550201416 key: test_mcc value: [0.59222009 0.68803296 0.63245553 0.63960215 0.68421053 0.58218174 0.63960215 0.73786479 0.52960948 0.73821295] mean value: 0.6463992364353213 key: train_mcc value: [0.65923425 0.60639664 0.70632241 0.65322377 0.67175144 0.7236421 0.60766169 0.74707175 0.71317436 0.69522435] mean value: 0.6783702762219075 key: test_accuracy value: [0.78947368 0.84210526 0.81578947 0.81578947 0.84210526 0.78947368 0.81578947 0.86842105 0.75675676 0.86486486] mean value: 0.8200568990042674 key: train_accuracy value: [0.82941176 0.80294118 0.85294118 
0.82647059 0.83529412 0.86176471 0.80294118 0.87352941 0.85630499 0.84750733] mean value: 0.8389106434362601 key: test_fscore value: [0.76470588 0.83333333 0.81081081 0.8 0.84210526 0.77777778 0.8 0.87179487 0.79069767 0.84848485] mean value: 0.8139710462131082 key: train_fscore value: [0.82634731 0.7987988 0.8502994 0.8238806 0.83030303 0.86053412 0.79510703 0.87315634 0.85285285 0.84615385] mean value: 0.8357433332161395 key: test_precision value: [0.86666667 0.88235294 0.83333333 0.875 0.84210526 0.82352941 0.875 0.85 0.70833333 0.93333333] mean value: 0.8489654282765737 key: train_precision value: [0.84146341 0.81595092 0.86585366 0.83636364 0.85625 0.86826347 0.82802548 0.87573964 0.87116564 0.85628743] mean value: 0.8515363294832559 key: test_recall value: [0.68421053 0.78947368 0.78947368 0.73684211 0.84210526 0.73684211 0.73684211 0.89473684 0.89473684 0.77777778] mean value: 0.7883040935672514 key: train_recall value: [0.81176471 0.78235294 0.83529412 0.81176471 0.80588235 0.85294118 0.76470588 0.87058824 0.83529412 0.83625731] mean value: 0.8206845545235638 key: test_roc_auc value: [0.78947368 0.84210526 0.81578947 0.81578947 0.84210526 0.78947368 0.81578947 0.86842105 0.75292398 0.8625731 ] mean value: 0.8194444444444444 key: train_roc_auc value: [0.82941176 0.80294118 0.85294118 0.82647059 0.83529412 0.86176471 0.80294118 0.87352941 0.85624355 0.84754042] mean value: 0.8389078087375301 key: test_jcc value: [0.61904762 0.71428571 0.68181818 0.66666667 0.72727273 0.63636364 0.66666667 0.77272727 0.65384615 0.73684211] mean value: 0.6875536743957796 key: train_jcc value: [0.70408163 0.665 0.73958333 0.70050761 0.70984456 0.75520833 0.65989848 0.77486911 0.7434555 0.73333333] mean value: 0.7185781890938955 MCC on Blind test: 0.77 Accuracy on Blind test: 0.89 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', 
LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', 
MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01748848 0.01892996 0.01658511 0.01925921 0.01871228 0.01994371 0.01729989 0.01780486 0.0200069 0.01862764] mean value: 0.018465805053710937 key: score_time value: [0.0090909 0.01126218 0.01121306 0.01176643 0.01179862 0.01175475 0.01167011 0.01188803 0.01181984 0.0126853 ] mean value: 0.011494922637939452 key: test_mcc value: [0.76376262 0.89973541 0.38729833 0.78947368 0.84327404 0.52704628 0.29277002 0.79388419 0.62807634 0.75614764] mean value: 0.6681468554833931 key: train_mcc value: [0.8452381 0.79448906 0.5864073 0.88333157 0.89010061 0.90189002 0.40544243 0.84174979 0.88932517 0.89043758] mean value: 0.7928411614629753 key: test_accuracy value: [0.86842105 0.94736842 0.65789474 0.89473684 0.92105263 0.76315789 0.57894737 0.89473684 0.81081081 0.86486486] mean value: 0.820199146514936 key: train_accuracy value: [0.91764706 0.88823529 0.75588235 0.94117647 0.94411765 0.95 0.64117647 0.91470588 0.94428152 0.94428152] mean value: 0.8841504226323961 key: test_fscore value: [0.84848485 0.95 0.73469388 0.89473684 0.91891892 0.76923077 0.7037037 0.9 0.82926829 0.83870968] mean value: 0.8387746930096805 key: train_fscore value: [0.91082803 0.89893617 0.80378251 0.94252874 0.94224924 0.95156695 0.73593074 0.90675241 0.94524496 0.94259819] mean value: 0.8980417920511166 key: test_precision value: [1. 
0.9047619 0.6 0.89473684 0.94444444 0.75 0.54285714 0.85714286 0.77272727 1. ] mean value: 0.8266670464038886 key: train_precision value: [0.99305556 0.82038835 0.67193676 0.92134831 0.97484277 0.92265193 0.58219178 1. 0.92655367 0.975 ] mean value: 0.8787969132705697 key: test_recall value: [0.73684211 1. 0.94736842 0.89473684 0.89473684 0.78947368 1. 0.94736842 0.89473684 0.72222222] mean value: 0.882748538011696 key: train_recall value: [0.84117647 0.99411765 1. 0.96470588 0.91176471 0.98235294 1. 0.82941176 0.96470588 0.9122807 ] mean value: 0.9400515995872033 key: test_roc_auc value: [0.86842105 0.94736842 0.65789474 0.89473684 0.92105263 0.76315789 0.57894737 0.89473684 0.80847953 0.86111111] mean value: 0.8195906432748539 key: train_roc_auc value: [0.91764706 0.88823529 0.75588235 0.94117647 0.94411765 0.95 0.64117647 0.91470588 0.94434125 0.94437564] mean value: 0.8841658066735466 key: test_jcc value: [0.73684211 0.9047619 0.58064516 0.80952381 0.85 0.625 0.54285714 0.81818182 0.70833333 0.72222222] mean value: 0.7298367497433711 key: train_jcc value: [0.83625731 0.81642512 0.67193676 0.89130435 0.8908046 0.9076087 0.58219178 0.82941176 0.89617486 0.89142857] mean value: 0.8213543811131508 MCC on Blind test: 0.79 Accuracy on Blind test: 0.89 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', 
RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01583433 0.01828694 0.01696563 0.02924252 0.03335357 0.01711559 0.01663041 0.01640534 0.01945257 0.01540542] mean value: 0.019869232177734376 key: score_time value: [0.01194024 0.01182508 0.02836466 0.01295829 0.02824974 0.0117662 0.01195216 0.01181483 0.01190305 0.01181436] mean value: 0.015258860588073731 key: test_mcc value: [0.74620251 0.85280287 0.57894737 0.84327404 0.84327404 0.69989647 0.76376262 0.63828474 0.63129316 0.69356297] mean value: 0.7291300780936345 key: train_mcc value: [0.78047467 0.86610667 0.91771057 0.87209836 0.90594505 0.81649658 0.67766324 0.74545617 0.88351945 0.72157164] mean value: 0.8187042409466225 key: test_accuracy value: [0.86842105 0.92105263 0.78947368 0.92105263 0.92105263 0.84210526 0.86842105 0.78947368 0.81081081 0.83783784] mean value: 0.8569701280227596 key: train_accuracy value: [0.88235294 0.92941176 0.95882353 0.93529412 0.95294118 0.9 0.81470588 0.85882353 0.93841642 0.84750733] mean value: 0.901827669484216 key: test_fscore value: [0.87804878 0.91428571 0.78947368 0.92307692 0.92307692 0.85714286 0.88372093 0.82608696 0.8 0.85 ] mean value: 0.8644912769035046 key: train_fscore value: [0.89304813 0.9245283 0.95857988 0.93714286 0.95266272 0.90909091 0.84367246 0.87564767 0.93416928 0.86597938] mean value: 0.909452158542273 key: test_precision value: [0.81818182 1. 0.78947368 0.9 0.9 0.7826087 0.79166667 0.7037037 0.875 0.77272727] mean value: 0.8333361841142162 key: train_precision value: [0.81862745 0.99324324 0.96428571 0.91111111 0.95833333 0.83333333 0.72961373 0.78240741 1. 
0.77419355] mean value: 0.8765148875987211 key: test_recall value: [0.94736842 0.84210526 0.78947368 0.94736842 0.94736842 0.94736842 1. 1. 0.73684211 0.94444444] mean value: 0.910233918128655 key: train_recall value: [0.98235294 0.86470588 0.95294118 0.96470588 0.94705882 1. 1. 0.99411765 0.87647059 0.98245614] mean value: 0.9564809081527348 key: test_roc_auc value: [0.86842105 0.92105263 0.78947368 0.92105263 0.92105263 0.84210526 0.86842105 0.78947368 0.8128655 0.84064327] mean value: 0.8574561403508771 key: train_roc_auc value: [0.88235294 0.92941176 0.95882353 0.93529412 0.95294118 0.9 0.81470588 0.85882353 0.93823529 0.84711042] mean value: 0.9017698658410733 key: test_jcc value: [0.7826087 0.84210526 0.65217391 0.85714286 0.85714286 0.75 0.79166667 0.7037037 0.66666667 0.73913043] mean value: 0.7642341057958907 key: train_jcc value: [0.80676329 0.85964912 0.92045455 0.88172043 0.90960452 0.83333333 0.72961373 0.77880184 0.87647059 0.76363636] mean value: 0.8360047765595798 MCC on Blind test: 0.75 Accuracy on Blind test: 0.88 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.17080402 0.1569531 0.15340972 0.15698004 0.15308666 0.15458584 0.14575839 0.14861679 0.15207219 0.14971685] mean value: 0.15419836044311525 key: score_time value: [0.01653957 0.0166285 0.01610899 0.01642632 0.0170877 0.0152936 0.01553082 0.01582837 0.01660395 0.01644444] mean value: 0.01624922752380371 key: test_mcc value: [1. 1. 1. 0.9486833 1. 0.89973541 0.84327404 0.89473684 0.78362573 0.94736842] mean value: 0.9317423745756566 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 1. 0.97368421 1. 0.94736842 0.92105263 0.94736842 0.89189189 0.97297297] mean value: 0.9654338549075391 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 1. 0.97435897 1. 0.94444444 0.92307692 0.94736842 0.89473684 0.97297297] mean value: 0.965695857801121 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 0.95 1. 1. 0.9 0.94736842 0.89473684 0.94736842] mean value: 0.9639473684210527 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 1. 0.89473684 0.94736842 0.94736842 0.89473684 1. ] mean value: 0.968421052631579 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 1. 0.97368421 1. 0.94736842 0.92105263 0.94736842 0.89181287 0.97368421] mean value: 0.9654970760233919 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 1. 1. 0.95 1. 
0.89473684 0.85714286 0.9 0.80952381 0.94736842] mean value: 0.9358771929824561 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, 
random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.05442548 0.06178308 0.06514335 0.05190825 0.05278492 0.05515146 0.04277182 0.05043507 0.06439567 0.04802108] mean value: 0.054682016372680664 key: score_time value: [0.03232479 0.03214812 0.02032709 0.02793241 0.03441763 0.01812482 0.02434039 0.01886606 0.02517366 0.03107882] mean value: 0.026473379135131835 key: test_mcc value: [1. 1. 1. 0.9486833 0.9486833 0.89973541 0.84327404 0.9486833 0.73099415 0.94736842] mean value: 0.9267421920804961 key: train_mcc value: [1. 0.99413485 1. 1. 0.98830369 0.99413485 0.98823529 0.99413485 1. 0.98833809] mean value: 0.9947281618348918 key: test_accuracy value: [1. 1. 1. 0.97368421 0.97368421 0.94736842 0.92105263 0.97368421 0.86486486 0.97297297] mean value: 0.9627311522048364 key: train_accuracy value: [1. 0.99705882 1. 1. 0.99411765 0.99705882 0.99411765 0.99705882 1. 0.9941349 ] mean value: 0.9973546662066586 key: test_fscore value: [1. 1. 1. 
0.97435897 0.97435897 0.94444444 0.92307692 0.97435897 0.86486486 0.97297297] mean value: 0.9628436128436129 key: train_fscore value: [1. 0.99705015 1. 1. 0.99408284 0.99705015 0.99411765 0.99705015 1. 0.99411765] mean value: 0.997346857683221 key: test_precision value: [1. 1. 1. 0.95 0.95 1. 0.9 0.95 0.88888889 0.94736842] mean value: 0.958625730994152 key: train_precision value: [1. 1. 1. 1. 1. 1. 0.99411765 1. 1. 1. ] mean value: 0.9994117647058823 key: test_recall value: [1. 1. 1. 1. 1. 0.89473684 0.94736842 1. 0.84210526 1. ] mean value: 0.968421052631579 key: train_recall value: [1. 0.99411765 1. 1. 0.98823529 0.99411765 0.99411765 0.99411765 1. 0.98830409] mean value: 0.9953009975920193 key: test_roc_auc value: [1. 1. 1. 0.97368421 0.97368421 0.94736842 0.92105263 0.97368421 0.86549708 0.97368421] mean value: 0.9628654970760234 key: train_roc_auc value: [1. 0.99705882 1. 1. 0.99411765 0.99705882 0.99411765 0.99705882 1. 0.99415205] mean value: 0.9973563811489509 key: test_jcc value: [1. 1. 1. 0.95 0.95 0.89473684 0.85714286 0.95 0.76190476 0.94736842] mean value: 0.9311152882205513 key: train_jcc value: [1. 0.99411765 1. 1. 0.98823529 0.99411765 0.98830409 0.99411765 1. 
0.98830409] mean value: 0.994719642242862 MCC on Blind test: 0.91 Accuracy on Blind test: 0.96 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.08240032 0.13941526 0.11900902 0.09115481 0.0813446 0.11244035 0.08711672 0.18356252 0.14831948 0.12165046] mean value: 0.11664135456085205 key: score_time value: [0.02674294 0.03760934 0.0219245 0.01437283 0.02252603 0.02183795 0.02192736 0.04869914 0.02245545 0.02846289] mean value: 0.02665584087371826 key: test_mcc value: [0.58218174 0.68421053 0.37047929 0.68803296 0.65465367 0.63960215 0.68803296 0.68803296 0.24189738 0.73821295] mean value: 0.5975336578127587 key: train_mcc value: [0.99413485 0.99413485 0.99413485 1. 1. 0.99413485 0.99413485 0.99413485 0.99415185 0.99415205] mean value: 0.9953112973615207 key: test_accuracy value: [0.78947368 0.84210526 0.68421053 0.84210526 0.81578947 0.81578947 0.84210526 0.84210526 0.62162162 0.86486486] mean value: 0.7960170697012802 key: train_accuracy value: [0.99705882 0.99705882 0.99705882 1. 1. 
0.99705882 0.99705882 0.99705882 0.99706745 0.99706745] mean value: 0.9976487838537175 key: test_fscore value: [0.77777778 0.84210526 0.66666667 0.83333333 0.8372093 0.8 0.85 0.83333333 0.65 0.84848485] mean value: 0.7938910525079436 key: train_fscore value: [0.99705015 0.99705015 0.99705015 1. 1. 0.99705015 0.99705015 0.99705015 0.99705015 0.99706745] mean value: 0.9976418481128729 key: test_precision value: [0.82352941 0.84210526 0.70588235 0.88235294 0.75 0.875 0.80952381 0.88235294 0.61904762 0.93333333] mean value: 0.812312767212148 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.73684211 0.84210526 0.63157895 0.78947368 0.94736842 0.73684211 0.89473684 0.78947368 0.68421053 0.77777778] mean value: 0.7830409356725146 key: train_recall value: [0.99411765 0.99411765 0.99411765 1. 1. 0.99411765 0.99411765 0.99411765 0.99411765 0.99415205] mean value: 0.9952975576195391 key: test_roc_auc value: [0.78947368 0.84210526 0.68421053 0.84210526 0.81578947 0.81578947 0.84210526 0.84210526 0.61988304 0.8625731 ] mean value: 0.7956140350877193 key: train_roc_auc value: [0.99705882 0.99705882 0.99705882 1. 1. 0.99705882 0.99705882 0.99705882 0.99705882 0.99707602] mean value: 0.9976487788097695 key: test_jcc value: [0.63636364 0.72727273 0.5 0.71428571 0.72 0.66666667 0.73913043 0.71428571 0.48148148 0.73684211] mean value: 0.6636328480401706 key: train_jcc value: [0.99411765 0.99411765 0.99411765 1. 1. 
0.99411765 0.99411765 0.99411765 0.99411765 0.99415205] mean value: 0.9952975576195391 MCC on Blind test: 0.54 Accuracy on Blind test: 0.77 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), 
('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.56569266 0.5649147 0.54991865 0.56822872 0.55563831 0.56164074 0.56107974 0.54614925 0.56099558 0.56145477] mean value: 0.5595713138580323 key: score_time value: [0.0103898 0.00958323 0.00938058 0.00951958 0.00956798 0.01010776 0.00926852 0.00950861 0.01079369 0.01012635] mean value: 0.009824609756469727 key: test_mcc value: [1. 1. 0.9486833 0.9486833 0.9486833 0.89973541 0.89973541 0.9486833 0.78362573 0.94736842] mean value: 0.9325198165933714 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 1. 0.97368421 0.97368421 0.97368421 0.94736842 0.94736842 0.97368421 0.89189189 0.97297297] mean value: 0.9654338549075391 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 1. 0.97435897 0.97435897 0.97435897 0.94444444 0.95 0.97435897 0.89473684 0.97297297] mean value: 0.9659590156958578 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.95 0.95 0.95 1. 
0.9047619 0.95 0.89473684 0.94736842] mean value: 0.9546867167919799 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 1. 0.89473684 1. 1. 0.89473684 1. ] mean value: 0.9789473684210527 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 1. 0.97368421 0.97368421 0.97368421 0.94736842 0.94736842 0.97368421 0.89181287 0.97368421] mean value: 0.9654970760233919 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 1. 0.95 0.95 0.95 0.89473684 0.9047619 0.95 0.80952381 0.94736842] mean value: 0.9356390977443609 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, 
num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02689314 0.02751875 0.02677655 0.0268662 0.02773142 0.03597403 0.02919483 0.02741051 0.02737808 0.02891994] mean value: 0.028466343879699707 key: score_time value: [0.01260877 0.01430941 0.01301813 0.01564336 0.01542616 0.01571608 0.01542115 0.0155189 0.01564002 0.01539946] mean value: 0.014870142936706543 key: test_mcc value: [0.63245553 0.21821789 0.26315789 0.59222009 0.74620251 0.31622777 0.26462806 0.42640143 0.19504453 0.83871328] mean value: 0.4493268986749522 key: train_mcc value: [0.99413485 0.92077472 0.77216846 0.976741 0.98250594 0.91533482 0.9707394 0.93725826 0.89204798 0.98826969] mean value: 0.9349975124716327 key: test_accuracy value: [0.81578947 0.60526316 0.63157895 0.78947368 0.86842105 0.65789474 0.63157895 0.71052632 0.59459459 0.91891892] mean value: 0.7224039829302987 key: train_accuracy value: [0.99705882 0.95882353 0.87352941 0.98823529 0.99117647 0.95588235 0.98529412 0.96764706 0.94428152 0.9941349 ] mean value: 0.9656063481110919 key: test_fscore value: [0.82051282 0.65116279 0.63157895 0.76470588 0.87804878 0.64864865 0.61111111 0.68571429 0.66666667 0.91428571] mean value: 0.7272435647846088 key: train_fscore value: [0.99705015 0.96045198 0.85521886 0.98809524 0.99109792 0.95384615 0.9851632 0.96656535 0.94647887 0.99415205] mean value: 0.9638119769217577 key: test_precision value: [0.8 0.58333333 0.63157895 0.86666667 0.81818182 0.66666667 0.64705882 0.75 0.57692308 0.94117647] mean value: 0.728158580325763 key: train_precision value: [1. 0.92391304 1. 1. 1. 1. 0.99401198 1. 
0.90810811 0.99415205] mean value: 0.9820185174417899 key: test_recall value: [0.84210526 0.73684211 0.63157895 0.68421053 0.94736842 0.63157895 0.57894737 0.63157895 0.78947368 0.88888889] mean value: 0.7362573099415204 key: train_recall value: [0.99411765 1. 0.74705882 0.97647059 0.98235294 0.91176471 0.97647059 0.93529412 0.98823529 0.99415205] mean value: 0.9505916752665978 key: test_roc_auc value: [0.81578947 0.60526316 0.63157895 0.78947368 0.86842105 0.65789474 0.63157895 0.71052632 0.58918129 0.91812865] mean value: 0.7217836257309942 key: train_roc_auc value: [0.99705882 0.95882353 0.87352941 0.98823529 0.99117647 0.95588235 0.98529412 0.96764706 0.94441004 0.99413485] mean value: 0.9656191950464397 key: test_jcc value: [0.69565217 0.48275862 0.46153846 0.61904762 0.7826087 0.48 0.44 0.52173913 0.5 0.84210526] mean value: 0.5825449964433631 key: train_jcc value: [0.99411765 0.92391304 0.74705882 0.97647059 0.98235294 0.91176471 0.97076023 0.93529412 0.89839572 0.98837209] mean value: 0.9328499915874191 MCC on Blind test: 0.43 Accuracy on Blind test: 0.71 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.0276432 0.05927157 0.03859544 0.03848791 0.03850269 0.05718088 0.03406334 0.01551056 0.01546884 0.01610494] mean value: 0.03408293724060059 key: score_time value: [0.02301216 0.02319098 0.02459955 0.0241189 0.0216651 0.02239084 0.0124681 0.01247931 0.0128572 0.01285744] mean value: 0.018963956832885744 key: test_mcc value: [0.84327404 0.9486833 0.73786479 0.84327404 0.84327404 0.48454371 0.89473684 0.79388419 0.56725146 0.78764146] mean value: 0.7744427877554028 key: train_mcc value: [0.87064849 0.88241401 0.89417953 0.86484056 0.88825066 0.89417953 0.87058824 0.88235294 0.89442724 0.88269694] mean value: 0.8824578138125448 key: test_accuracy value: [0.92105263 0.97368421 0.86842105 0.92105263 0.92105263 0.73684211 0.94736842 0.89473684 0.78378378 0.89189189] mean value: 0.8859886201991465 key: train_accuracy value: [0.93529412 0.94117647 0.94705882 0.93235294 0.94411765 0.94705882 0.93529412 0.94117647 0.94721408 0.94134897] mean value: 0.9412092461618078 key: test_fscore value: [0.91891892 0.97435897 0.87179487 0.92307692 0.92307692 0.70588235 0.94736842 0.9 0.78947368 0.88235294] mean value: 0.8836304010607416
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:156: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:159: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: train_fscore value: [0.93567251 0.9408284 0.94674556 0.93294461 0.9439528 0.94736842 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9412562188544396 key: test_precision value: [0.94444444 0.95 0.85 0.9 0.9 0.8 0.94736842 0.85714286 0.78947368 0.9375 ] mean value: 0.8875929406850459 key: train_precision value: [0.93023256 0.94642857 0.95238095 0.92485549 0.94674556 0.94186047 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9407553480125959 key: test_recall value: [0.89473684 1. 0.89473684 0.94736842 0.94736842 0.63157895 0.94736842 0.94736842 0.78947368 0.83333333] mean value: 0.8833333333333333 key: train_recall value: [0.94117647 0.93529412 0.94117647 0.94117647 0.94117647 0.95294118 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9417991056071551 key: test_roc_auc value: [0.92105263 0.97368421 0.86842105 0.92105263 0.92105263 0.73684211 0.94736842 0.89473684 0.78362573 0.89035088] mean value: 0.8858187134502924 key: train_roc_auc value: [0.93529412 0.94117647 0.94705882 0.93235294 0.94411765 0.94705882 0.93529412 0.94117647 0.94721362 0.94134847] mean value: 0.9412091503267974 key: test_jcc value: [0.85 0.95 0.77272727 0.85714286 0.85714286 0.54545455 0.9 0.81818182 0.65217391 0.78947368] mean value: 0.7992296947903355 key: train_jcc value: [0.87912088 0.88826816 0.8988764 0.87431694 0.89385475 0.9 0.87845304 0.88888889 0.89944134 0.88950276] mean value: 0.8890723159309888 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV',
LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', 
MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.27244306 0.33298659 0.28044224 0.39420319 0.54344702 0.35148716 0.35520601 0.30538011 0.34263945 0.27838564] mean value: 0.3456620454788208 key: score_time value: [0.02331948 0.01721334 0.02358603 0.02463841 0.02457571 0.02402425 0.02256465 0.02319074 0.02490997 0.02106881] mean value: 0.022909140586853026 key: test_mcc value: [0.84327404 0.9486833 0.78947368 0.79388419 0.84327404 0.48454371 0.89473684 0.79388419 0.56725146 0.78764146] mean value: 0.7746646917717809 key: train_mcc value: [0.87064849 0.88241401 0.95300713 0.8058963 0.88825066 0.89417953 0.87058824 0.88235294 0.89442724 0.88269694] mean value: 0.8824461478103343 key: test_accuracy value: [0.92105263 0.97368421 0.89473684 0.89473684 0.92105263 0.73684211 0.94736842 0.89473684 0.78378378 0.89189189] mean value: 0.8859886201991465 key: train_accuracy value: [0.93529412 0.94117647 0.97647059 0.90294118 0.94411765 0.94705882 0.93529412 0.94117647 0.94721408 0.94134897] mean value: 0.9412092461618078 key: test_fscore value: [0.91891892 0.97435897 0.89473684 0.9 0.92307692 0.70588235 0.94736842 0.9 0.78947368 0.88235294] mean value: 0.8836169057840885 key: train_fscore value: [0.93567251 0.9408284 0.97633136 0.90265487 0.9439528 0.94736842 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9411858248203606 key: test_precision value: [0.94444444 0.95 0.89473684 0.85714286 0.9 0.8 0.94736842 
0.85714286 0.78947368 0.9375 ] mean value: 0.887780910609858 key: train_precision value: [0.93023256 0.94642857 0.98214286 0.90532544 0.94674556 0.94186047 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9417785337345366 key: test_recall value: [0.89473684 1. 0.89473684 0.94736842 0.94736842 0.63157895 0.94736842 0.94736842 0.78947368 0.83333333] mean value: 0.8833333333333333 key: train_recall value: [0.94117647 0.93529412 0.97058824 0.9 0.94117647 0.95294118 0.93529412 0.94117647 0.94705882 0.94152047] mean value: 0.9406226350189199 key: test_roc_auc value: [0.92105263 0.97368421 0.89473684 0.89473684 0.92105263 0.73684211 0.94736842 0.89473684 0.78362573 0.89035088] mean value: 0.8858187134502924 key: train_roc_auc value: [0.93529412 0.94117647 0.97647059 0.90294118 0.94411765 0.94705882 0.93529412 0.94117647 0.94721362 0.94134847] mean value: 0.9412091503267974 key: test_jcc value: [0.85 0.95 0.80952381 0.81818182 0.85714286 0.54545455 0.9 0.81818182 0.65217391 0.78947368] mean value: 0.7990132445738853 key: train_jcc value: [0.87912088 0.88826816 0.95375723 0.82258065 0.89385475 0.9 0.87845304 0.88888889 0.89944134 0.88950276] mean value: 0.8893867685519612 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03672051 0.03351378 0.03409219 0.04177117 0.0275321 0.03296661 0.07960463 0.03673577 0.03128195 0.07671475] mean value: 0.04309334754943848 key: score_time value: [0.01211333 0.01203656 0.01435566 0.01213312 0.01190567 0.01184034 0.0147748 0.02327275 0.01196027 0.01549888] mean value: 0.013989138603210449 key: test_mcc value: [0.94736842 0.83918129 0.63129316 0.78362573 0.78362573 0.73099415 0.80369958 0.83918129 0.72333935 0.83462233] mean value: 0.7916931020080674 key: train_mcc value: [0.85498357 0.87915298 0.89743309 0.86706827 0.88521358 0.87312888 0.86104418 0.86706827 0.85548378 0.87350983] mean value: 0.8714086429236412 key: test_accuracy value: [0.97297297 0.91891892 0.81081081 0.89189189 0.89189189 0.86486486 0.89189189 0.91891892 0.86111111 0.91666667] mean value: 0.893993993993994 key: train_accuracy value: [0.92749245 0.93957704 0.94864048 0.93353474 0.94259819 0.93655589 0.9305136 0.93353474 0.92771084 0.93674699] mean value: 0.9356904961234667 key: test_fscore value: [0.97297297 0.91891892 0.82051282 0.88888889 0.89473684 0.86486486 0.88235294 0.91891892 0.85714286 0.91891892] mean value: 0.8938228944420895 key: train_fscore value: [0.92771084 0.93975904 0.94832827 0.93373494 0.94259819 0.93655589 0.9305136 0.93333333 0.92814371 0.93655589] mean value: 0.9357233697617179 key: test_precision value: [0.94736842 0.89473684 0.76190476 0.88888889 0.89473684 0.88888889 1. 
0.94444444 0.88235294 0.89473684] mean value: 0.8998058872671876 key: train_precision value: [0.92771084 0.93975904 0.95705521 0.93373494 0.93975904 0.93373494 0.92771084 0.93333333 0.92261905 0.93939394] mean value: 0.9354811173624463 key: test_recall value: [1. 0.94444444 0.88888889 0.88888889 0.89473684 0.84210526 0.78947368 0.89473684 0.83333333 0.94444444] mean value: 0.8921052631578947 key: train_recall value: [0.92771084 0.93975904 0.93975904 0.93373494 0.94545455 0.93939394 0.93333333 0.93333333 0.93373494 0.93373494] mean value: 0.9359948886454911 key: test_roc_auc value: [0.97368421 0.91959064 0.8128655 0.89181287 0.89181287 0.86549708 0.89473684 0.91959064 0.86111111 0.91666667] mean value: 0.8947368421052632 key: train_roc_auc value: [0.92749179 0.93957649 0.9486674 0.93353414 0.94260679 0.93656444 0.93052209 0.93353414 0.92771084 0.93674699] mean value: 0.9356955093099671 key: test_jcc value: [0.94736842 0.85 0.69565217 0.8 0.80952381 0.76190476 0.78947368 0.85 0.75 0.85 ] mean value: 0.8103922850604772 key: train_jcc value: [0.86516854 0.88636364 0.9017341 0.87570621 0.89142857 0.88068182 0.8700565 0.875 0.86592179 0.88068182] mean value: 0.8792742987101834 MCC on Blind test: 0.83 Accuracy on Blind test: 0.91 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.78824234 0.96631408 0.8446548 1.35734797 1.61382103 1.39275098 1.2790935 1.1367147 1.12887931 0.96179318]
mean value: 1.1469611883163453

key: score_time
value: [0.01487708 0.01531816 0.01532412 0.01569772 0.0157392 0.01540256 0.01546884 0.01227903 0.02109766 0.01225138]
mean value: 0.015345573425292969

key: test_mcc
value: [0.94736842 0.62280702 0.57184997 0.78362573 0.83871328 0.78362573 0.94736842 0.89181287 0.83462233 0.83462233]
mean value: 0.8056416089443539

key: train_mcc
value: [0.90339187 1. 0.9939759 0.90332238 0.98203333 0.90339187 1. 0.98189054 1. 0.99399394]
mean value: 0.9661999838560218

key: test_accuracy
value: [0.97297297 0.81081081 0.78378378 0.89189189 0.91891892 0.89189189 0.97297297 0.94594595 0.91666667 0.91666667]
mean value: 0.9022522522522523

key: train_accuracy
value: [0.95166163 1. 0.99697885 0.95166163 0.99093656 0.95166163 1. 0.99093656 1. 0.99698795]
mean value: 0.9830824809813271

key: test_fscore
value: [0.97297297 0.81081081 0.78947368 0.88888889 0.92307692 0.89473684 0.97297297 0.94736842 0.91428571 0.91891892]
mean value: 0.9033506149295623

key: train_fscore
value: [0.95151515 1. 0.99697885 0.95180723 0.99082569 0.95180723 1. 0.99088146 1. 0.99697885]
mean value: 0.9830794460313929

key: test_precision
value: [0.94736842 0.78947368 0.75 0.88888889 0.9 0.89473684 1. 0.94736842 0.94117647 0.89473684]
mean value: 0.895374957000344

key: train_precision
value: [0.95731707 1. 1. 0.95180723 1. 0.94610778 1. 0.99390244 1. 1. ]
mean value: 0.9849134525541923

key: test_recall
value: [1. 0.83333333 0.83333333 0.88888889 0.94736842 0.89473684 0.94736842 0.94736842 0.88888889 0.94444444]
mean value: 0.9125730994152047

key: train_recall
value: [0.94578313 1. 0.9939759 0.95180723 0.98181818 0.95757576 1. 0.98787879 1. 0.9939759 ]
mean value: 0.9812814895947426

key: test_roc_auc
value: [0.97368421 0.81140351 0.78508772 0.89181287 0.91812865 0.89181287 0.97368421 0.94590643 0.91666667 0.91666667]
mean value: 0.902485380116959

key: train_roc_auc
value: [0.95167945 1. 0.99698795 0.95166119 0.99090909 0.95167945 1. 0.99092735 1. 0.99698795]
mean value: 0.9830832420591457

key: test_jcc
value: [0.94736842 0.68181818 0.65217391 0.8 0.85714286 0.80952381 0.94736842 0.9 0.84210526 0.85 ]
mean value: 0.8287500866791484

key: train_jcc
value: [0.90751445 1. 0.9939759 0.90804598 0.98181818 0.90804598 1. 0.98192771 1. 0.9939759 ]
mean value: 0.9675304104780511

MCC on Blind test: 0.69
Accuracy on Blind test: 0.84

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting',
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())])

key: fit_time
value: [0.01437283 0.01098561 0.0105083 0.01576018 0.01028442 0.01016164 0.01007247 0.01817203 0.01193213 0.01369476]
mean value: 0.01259443759918213

key: score_time
value: [0.01252317 0.00955033 0.01525402 0.01239777 0.0093441 0.00946498 0.00936222 0.01498938 0.01097512 0.01218009]
mean value: 0.011604118347167968

key: test_mcc
value: [0.73099415 0.40469382 0.73099415 0.51319869 0.48981224 0.62280702 0.60308132 0.45906433 0.61977979 0.79772404]
mean value: 0.5972149539148133

key: train_mcc
value: [0.64442374 0.62134114 0.66921665 0.68999143 0.67585241 0.64541184 0.67034019 0.65820219 0.70948192 0.69643271]
mean value: 0.6680694215597842

key: test_accuracy
value: [0.86486486 0.7027027 0.86486486 0.75675676 0.72972973 0.81081081 0.78378378 0.72972973 0.80555556 0.88888889]
mean value: 0.7937687687687688

key: train_accuracy
value: [0.81873112 0.80966767 0.83383686 0.8429003 0.83081571 0.82175227 0.83383686 0.82779456 0.85240964 0.84638554]
mean value: 0.8318130528154916

key: test_fscore
value: [0.86486486 0.68571429 0.86486486 0.74285714 0.6875 0.81081081 0.75 0.73684211 0.78787879 0.875 ]
mean value: 0.7806332862253915

key: train_fscore
value: [0.80519481 0.80250784 0.82866044 0.8343949 0.81081081 0.81388013 0.82539683 0.81904762 0.84345048 0.83809524]
mean value: 0.8221439081547757

key: test_precision
value: [0.84210526 0.70588235 0.84210526 0.76470588 0.84615385 0.83333333 0.92307692 0.73684211 0.86666667 1. ]
mean value: 0.8360871636103834

key: train_precision
value: [0.87323944 0.83660131 0.85806452 0.88513514 0.91603053 0.84868421 0.86666667 0.86 0.89795918 0.88590604]
mean value: 0.8728287030559482

key: test_recall
value: [0.88888889 0.66666667 0.88888889 0.72222222 0.57894737 0.78947368 0.63157895 0.73684211 0.72222222 0.77777778]
mean value: 0.7403508771929824

key: train_recall
value: [0.74698795 0.77108434 0.80120482 0.78915663 0.72727273 0.78181818 0.78787879 0.78181818 0.79518072 0.79518072]
mean value: 0.7777583059510771

key: test_roc_auc
value: [0.86549708 0.70175439 0.86549708 0.75584795 0.73391813 0.81140351 0.7880117 0.72953216 0.80555556 0.88888889]
mean value: 0.7945906432748537

key: train_roc_auc
value: [0.81894852 0.80978459 0.83393574 0.84306316 0.83050383 0.82163198 0.83369843 0.82765608 0.85240964 0.84638554]
mean value: 0.831801752464403

key: test_jcc
value: [0.76190476 0.52173913 0.76190476 0.59090909 0.52380952 0.68181818 0.6 0.58333333 0.65 0.77777778]
mean value: 0.6453196561892214

key: train_jcc
value: [0.67391304 0.67015707 0.70744681 0.71584699 0.68181818 0.68617021 0.7027027 0.69354839 0.72928177 0.72131148]
mean value: 0.6982196642336499

MCC on Blind test: 0.67
Accuracy on Blind test: 0.84

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree',
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())])

key: fit_time
value: [0.01818132 0.01185274 0.01171231 0.0114913 0.01783705 0.01429081 0.0138998 0.013304 0.01724982 0.01014686]
mean value: 0.013996601104736328

key: score_time
value: [0.01515007 0.00999379 0.01022625 0.01021886 0.01386929 0.01219106 0.01203775 0.01533842 0.01011515 0.0094707 ]
mean value: 0.011861133575439452

key: test_mcc
value: [0.74044197 0.62280702 0.4633451 0.57184997 0.56934383 0.62170355 0.7888597 0.6754386 0.4472136 0.78262379]
mean value: 0.628362712085414

key: train_mcc
value: [0.73459045 0.71601738 0.79022336 0.74626648 0.74713145 0.68655466 0.70405667 0.72205184 0.77108434 0.76674551]
mean value: 0.7384722135344838

key: test_accuracy
value: [0.86486486 0.81081081 0.72972973 0.78378378 0.78378378 0.81081081 0.89189189 0.83783784 0.72222222 0.88888889]
mean value: 0.8124624624624625

key: train_accuracy
value: [0.86706949 0.85800604 0.89425982 0.87311178 0.87311178 0.8429003 0.85196375 0.86102719 0.88554217 0.88253012]
mean value: 0.8689522440214028

key: test_fscore
value: [0.87179487 0.81081081 0.73684211 0.78947368 0.8 0.82051282 0.88888889 0.84210526 0.70588235 0.88235294]
mean value: 0.8148663738756617

key: train_fscore
value: [0.86982249 0.85885886 0.89795918 0.8742515 0.87573964 0.83850932 0.85285285 0.86060606 0.88554217 0.88629738]
mean value: 0.8700439444712924

key: test_precision
value: [0.80952381 0.78947368 0.7 0.75 0.76190476 0.8 0.94117647 0.84210526 0.75 0.9375 ]
mean value: 0.8081683989385228

key: train_precision
value: [0.85465116 0.85628743 0.8700565 0.86904762 0.85549133 0.85987261 0.8452381 0.86060606 0.88554217 0.85875706]
mean value: 0.8615550031773642

key: test_recall
value: [0.94444444 0.83333333 0.77777778 0.83333333 0.84210526 0.84210526 0.84210526 0.84210526 0.66666667 0.83333333]
mean value: 0.8257309941520468

key: train_recall
value: [0.88554217 0.86144578 0.92771084 0.87951807 0.8969697 0.81818182 0.86060606 0.86060606 0.88554217 0.91566265]
mean value: 0.8791785323110625

key: test_roc_auc
value: [0.86695906 0.81140351 0.73099415 0.78508772 0.78216374 0.80994152 0.89327485 0.8377193 0.72222222 0.88888889]
mean value: 0.8128654970760234

key: train_roc_auc
value: [0.86701351 0.85799562 0.89415845 0.87309237 0.87318364 0.84282585 0.85198978 0.86102592 0.88554217 0.88253012]
mean value: 0.8689357429718876

key: test_jcc
value: [0.77272727 0.68181818 0.58333333 0.65217391 0.66666667 0.69565217 0.8 0.72727273 0.54545455 0.78947368]
mean value: 0.6914572498439775

key: train_jcc
value: [0.76963351 0.75263158 0.81481481 0.77659574 0.77894737 0.72192513 0.7434555 0.75531915 0.79459459 0.79581152]
mean value: 0.77037289076449

MCC on Blind test: 0.73
Accuracy on Blind test: 0.86

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00932622 0.01532483 0.01057291 0.01194143 0.01086783 0.01041532 0.01080632 0.01105237 0.01040554 0.01048732] mean value: 0.011120009422302245 key: score_time value: [0.01911092 0.01833034 0.021945 0.01794815 0.01830196 0.01822591 0.01857352 0.01737976 0.01754856 0.01728535] mean value: 0.018464946746826173 key: test_mcc value: [0.46019501 0.63129316 0.07739329 0.51461988 0.35558302 0.51461988 0.57184997 0.25301653 0.3354102 0.50709255] mean value: 0.42210735008522005 key: train_mcc value: [0.69212796 0.66871448 0.71631061 0.65571257 0.67976195 0.69792238 0.64957827 0.65571257 0.67513995 0.66308388] mean value: 0.6754064609832358 key: test_accuracy value: [0.72972973 0.81081081 0.54054054 0.75675676 0.67567568 0.75675676 0.78378378 0.62162162 0.66666667 0.75 ] mean value: 0.7092342342342343 key: train_accuracy value: [0.84592145 0.83383686 0.85800604 0.82779456 0.83987915 0.8489426 0.82477341 0.82779456 0.8373494 0.8313253 ] mean value: 0.8375623339278564 key: test_fscore value: [0.70588235 0.82051282 0.48484848 0.75675676 0.71428571 0.75675676 0.77777778 0.58823529 0.64705882 0.76923077] mean value: 0.7021345550757315 key: train_fscore value: [0.84866469 0.82972136 0.86053412 0.82674772 0.83890578 0.84756098 0.82317073 0.82882883 0.84023669 0.83431953] mean value: 0.8378690419889865 key: test_precision value: [0.75 0.76190476 0.53333333 0.73684211 0.65217391 0.77777778 0.82352941 0.66666667 0.6875 0.71428571] mean value: 0.7104013684039596 key: train_precision value: [0.83625731 0.85350318 0.84795322 0.83435583 0.84146341 0.85276074 0.82822086 0.82142857 
0.8255814 0.81976744] mean value: 0.8361291957614069 key: test_recall value: [0.66666667 0.88888889 0.44444444 0.77777778 0.78947368 0.73684211 0.73684211 0.52631579 0.61111111 0.83333333] mean value: 0.7011695906432749 key: train_recall value: [0.86144578 0.80722892 0.87349398 0.81927711 0.83636364 0.84242424 0.81818182 0.83636364 0.85542169 0.84939759] mean value: 0.8399598393574297 key: test_roc_auc value: [0.72807018 0.8128655 0.5380117 0.75730994 0.67251462 0.75730994 0.78508772 0.62426901 0.66666667 0.75 ] mean value: 0.7092105263157895 key: train_roc_auc value: [0.84587441 0.83391749 0.85795911 0.82782037 0.83986857 0.84892296 0.82475356 0.82782037 0.8373494 0.8313253 ] mean value: 0.837561153705732 key: test_jcc value: [0.54545455 0.69565217 0.32 0.60869565 0.55555556 0.60869565 0.63636364 0.41666667 0.47826087 0.625 ] mean value: 0.5490344751866492 key: train_jcc value: [0.7371134 0.70899471 0.75520833 0.70466321 0.72251309 0.73544974 0.69948187 0.70769231 0.7244898 0.71573604] mean value: 0.7211342490784889 MCC on Blind test: 0.5 Accuracy on Blind test: 0.75 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01921725 0.01870728 0.01728058 0.01826978 0.01832366 0.01756644 0.01772666 0.02263999 0.01768255 0.01821613] mean value: 0.018563032150268555 key: score_time value: [0.01221204 0.01207209 0.01167274 0.01201773 0.01157498 0.01153445 0.01147413 0.01173925 0.01177478 0.01164174] mean value: 0.011771392822265626 key: test_mcc value: [0.94736842 0.89181287 0.63129316 0.73099415 0.78362573 0.73099415 0.7888597 0.89181287 0.50709255 0.88888889] mean value: 0.7792742482712791 key: train_mcc value: [0.81269853 0.80669661 0.84296615 0.81269853 0.80062066 0.81283091 0.79462558 0.80074488 0.82543601 0.78313253] mean value: 0.8092450407103096 key: test_accuracy value: [0.97297297 0.94594595 0.81081081 0.86486486 0.89189189 0.86486486 0.89189189 0.94594595 0.75 0.94444444] mean value: 0.8883633633633634 key: train_accuracy value: [0.90634441 0.90332326 0.92145015 0.90634441 0.90030211 0.90634441 0.89728097 0.90030211 0.9126506 0.89156627] mean value: 0.9045908710370182 key: test_fscore value: [0.97297297 0.94444444 0.82051282 0.86486486 0.89473684 0.86486486 0.88888889 0.94736842 0.72727273 0.94444444] mean value: 0.8870371291423923 key: train_fscore value: [0.90690691 0.90419162 0.92121212 0.90690691 0.90030211 0.90690691 0.89759036 0.9009009 0.91343284 0.89156627] mean value: 0.9049916936730755 key: test_precision value: [0.94736842 0.94444444 0.76190476 0.84210526 0.89473684 0.88888889 0.94117647 0.94736842 0.8 0.94444444] mean value: 0.8912437957639195 key: train_precision value: [0.90419162 0.89880952 0.92682927 0.90419162 0.89759036 0.89880952 0.89221557 0.89285714 
0.90532544 0.89156627] mean value: 0.901238633145709 key: test_recall value: [1. 0.94444444 0.88888889 0.88888889 0.89473684 0.84210526 0.84210526 0.94736842 0.66666667 0.94444444] mean value: 0.8859649122807017 key: train_recall value: [0.90963855 0.90963855 0.91566265 0.90963855 0.9030303 0.91515152 0.9030303 0.90909091 0.92168675 0.89156627] mean value: 0.9088134355604235 key: test_roc_auc value: [0.97368421 0.94590643 0.8128655 0.86549708 0.89181287 0.86549708 0.89327485 0.94590643 0.75 0.94444444] mean value: 0.8888888888888888 key: train_roc_auc value: [0.90633443 0.90330413 0.92146769 0.90633443 0.90031033 0.90637094 0.89729828 0.90032859 0.9126506 0.89156627] mean value: 0.904596568090544 key: test_jcc value: [0.94736842 0.89473684 0.69565217 0.76190476 0.80952381 0.76190476 0.8 0.9 0.57142857 0.89473684] mean value: 0.8037256183938106 key: train_jcc value: [0.82967033 0.82513661 0.85393258 0.82967033 0.81868132 0.82967033 0.81420765 0.81967213 0.84065934 0.80434783] mean value: 0.8265648452150891 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.30559778 1.49760962 1.48370218 1.27121735 1.36745763 1.29357028 1.46157598 1.41085505 1.2677567 1.47282529] mean value: 1.3832167863845826 key: score_time value: [0.01324964 0.01464891 0.01291299 0.01481247 0.01527667 0.01298475 0.01542163 0.01258349 0.01508832 0.01303482] mean value: 0.01400136947631836 key: test_mcc value: [0.89736456 0.73099415 0.56725146 0.83918129 0.83871328 0.7888597 0.84959079 0.83918129 0.72333935 0.9459053 ] mean value: 0.8020381173135782 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94594595 0.86486486 0.78378378 0.91891892 0.91891892 0.89189189 0.91891892 0.91891892 0.86111111 0.97222222] mean value: 0.8995495495495496 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 0.86486486 0.77777778 0.91891892 0.92307692 0.88888889 0.91428571 0.91891892 0.85714286 0.97297297] mean value: 0.8984216257900468 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9 0.84210526 0.77777778 0.89473684 0.9 0.94117647 1. 0.94444444 0.88235294 0.94736842] mean value: 0.9029962160302718 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 
1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.88888889 0.77777778 0.94444444 0.94736842 0.84210526 0.84210526 0.89473684 0.83333333 1. ] mean value: 0.8970760233918128 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.86549708 0.78362573 0.91959064 0.91812865 0.89327485 0.92105263 0.91959064 0.86111111 0.97222222] mean value: 0.9001461988304094 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.76190476 0.63636364 0.85 0.85714286 0.8 0.84210526 0.85 0.75 0.94736842] mean value: 0.8194884939621782 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, 
num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02773237 0.02290416 0.01795602 0.01657343 0.01584625 0.01707602 0.0161159 0.01845789 0.01577401 0.01874471] mean value: 0.0187180757522583 key: score_time value: [0.01235223 0.01075268 0.0102129 0.00925064 0.00990987 0.00948524 0.00892353 0.00983405 0.00992322 0.00996733] mean value: 0.010061168670654297 key: test_mcc value: [0.78362573 0.7888597 0.84834956 0.94736842 0.74044197 0.83918129 0.94736842 0.89736456 0.72333935 1. ] mean value: 0.8515898998369181 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.89189189 0.89189189 0.91891892 0.97297297 0.86486486 0.91891892 0.97297297 0.94594595 0.86111111 1. ] mean value: 0.923948948948949 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.89473684 0.90909091 0.97297297 0.85714286 0.91891892 0.97297297 0.94444444 0.86486486 1. ] mean value: 0.9224033671402092 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.88888889 0.85 1. 0.94736842 0.9375 0.94444444 1. 1. 0.84210526 1. ] mean value: 0.941030701754386 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.88888889 0.94444444 0.83333333 1. 0.78947368 0.89473684 0.94736842 0.89473684 0.88888889 1. ] mean value: 0.908187134502924 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.89181287 0.89327485 0.91666667 0.97368421 0.86695906 0.91959064 0.97368421 0.94736842 0.86111111 1. 
] mean value: 0.9244152046783626 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.80952381 0.83333333 0.94736842 0.75 0.85 0.94736842 0.89473684 0.76190476 1. ] mean value: 0.8594235588972431 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.9 Accuracy on Blind test: 0.95 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', 
SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.11874175 0.11397886 0.12404823 0.11526704 0.1260879 0.13378692 0.11424398 0.10551548 0.10430574 0.10480857] mean value: 0.11607844829559326 key: score_time value: [0.01930761 0.01774311 0.02675867 0.02725172 0.01928139 0.02496815 0.01777816 0.01777339 0.01758528 0.01737881] mean value: 0.02058262825012207 key: test_mcc value: [0.89736456 0.78362573 0.51461988 0.7888597 0.83871328 0.6754386 0.7888597 0.75938069 0.61977979 0.83462233] mean value: 0.7501264258080947 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94594595 0.89189189 0.75675676 0.89189189 0.91891892 0.83783784 0.89189189 0.86486486 0.80555556 0.91666667] mean value: 0.8722222222222222 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_fscore value: [0.94736842 0.88888889 0.75675676 0.89473684 0.92307692 0.84210526 0.88888889 0.84848485 0.78787879 0.91891892] mean value: 0.8697104539209802 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9 0.88888889 0.73684211 0.85 0.9 0.84210526 0.94117647 1. 0.86666667 0.89473684] mean value: 0.8820416236670107 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.88888889 0.77777778 0.94444444 0.94736842 0.84210526 0.84210526 0.73684211 0.72222222 0.94444444] mean value: 0.8646198830409356 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.89181287 0.75730994 0.89327485 0.91812865 0.8377193 0.89327485 0.86842105 0.80555556 0.91666667] mean value: 0.872953216374269 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.8 0.60869565 0.80952381 0.85714286 0.72727273 0.8 0.73684211 0.65 0.85 ] mean value: 0.7739477151376465 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00972319 0.00971651 0.00972152 0.00972605 0.00971293 0.00968099 0.00972724 0.00964713 0.00986457 0.00967932] mean value: 0.009719944000244141 key: score_time value: [0.00870895 0.00870967 0.00869751 0.00882626 0.00884151 0.00898957 0.0088141 0.00873256 0.00869012 0.00863934] mean value: 0.008764958381652832 key: test_mcc value: [0.56725146 0.62280702 0.18768409 0.74044197 0.57184997 0.30384671 0.83918129 0.56725146 0.52048344 0.63614643] mean value: 0.555694382971283 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.78378378 0.81081081 0.59459459 0.86486486 0.78378378 0.64864865 0.91891892 0.78378378 0.75 0.80555556] mean value: 0.7744744744744745 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.77777778 0.81081081 0.57142857 0.87179487 0.77777778 0.62857143 0.91891892 0.78947368 0.70967742 0.77419355] mean value: 0.7630424809032619 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.77777778 0.78947368 0.58823529 0.80952381 0.82352941 0.6875 0.94444444 0.78947368 0.84615385 0.92307692] mean value: 0.7979188875280206 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.77777778 0.83333333 0.55555556 0.94444444 0.73684211 0.57894737 0.89473684 0.78947368 0.61111111 0.66666667] mean value: 0.7388888888888889 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.78362573 0.81140351 0.59356725 0.86695906 0.78508772 0.6505848 0.91959064 0.78362573 0.75 0.80555556] mean value: 0.775 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.63636364 0.68181818 0.4 0.77272727 0.63636364 0.45833333 0.85 0.65217391 0.55 0.63157895] mean value: 0.626935892101796 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.56 Accuracy on Blind test: 0.78 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
 enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn(
 Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.51501703 1.47666764 1.49276137 1.48433304 1.4902401 1.48782825 1.50618243 1.49330306 1.57110786 1.59888744] mean value: 1.5116328239440917 key: score_time value: [0.09284711 0.09556246 0.09648204 0.09837961 0.09728193 0.09940147 0.0990901 0.0914495 0.09889078 0.09455132] mean value: 0.09639363288879395 key: test_mcc value: [0.94736842 0.89736456 0.89181287 0.94736842 0.89181287 0.89736456 0.94736842 0.94736842 0.72333935 0.9459053 ] mean value: 0.9037073192277006 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97297297 0.94594595 0.94594595 0.97297297 0.94594595 0.94594595 0.97297297 0.97297297 0.86111111 0.97222222] mean value: 0.950900900900901 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97297297 0.94736842 0.94444444 0.97297297 0.94736842 0.94444444 0.97297297 0.97297297 0.85714286 0.97297297] mean value: 0.9505633453001874 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94736842 0.9 0.94444444 0.94736842 0.94736842 1. 1. 1. 0.88235294 0.94736842] mean value: 0.9516271069831441 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.94444444 1. 0.94736842 0.89473684 0.94736842 0.94736842 0.83333333 1. ] mean value: 0.9514619883040936 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.94590643 0.97368421 0.94590643 0.94736842 0.97368421 0.97368421 0.86111111 0.97222222] mean value: 0.9514619883040936 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94736842 0.9 0.89473684 0.94736842 0.9 0.89473684 0.94736842 0.94736842 0.75 0.94736842] mean value: 0.9076315789473683 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.91 Accuracy on Blind test: 0.96 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, 
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.89117813 0.91720891 0.93513155 0.87951016 0.92528319 0.92419934 0.97259903 1.01499391 0.9730444 0.98030806] mean value: 0.9413456678390503 key: score_time value: [0.21520376 0.24883294 0.16850185 0.21201348 0.27353764 0.13810372 0.15924072 0.23260546 0.27458358 0.19869089] mean value: 0.21213140487670898 key: test_mcc value: [0.89181287 0.83918129 0.83918129 0.89736456 0.89181287 0.84959079 0.89736456 1. 
 0.72333935 0.9459053 ] mean value: 0.8775552871651642 key: train_mcc value: [0.96381759 0.95786323 0.94563709 0.96994925 0.96381495 0.95785863 0.96381495 0.95785863 0.96385542 0.95784871] mean value: 0.9602318452069324 key: test_accuracy value: [0.94594595 0.91891892 0.91891892 0.94594595 0.94594595 0.91891892 0.94594595 1. 0.86111111 0.97222222] mean value: 0.9373873873873874 key: train_accuracy value: [0.98187311 0.97885196 0.97280967 0.98489426 0.98187311 0.97885196 0.98187311 0.97885196 0.98192771 0.97891566] mean value: 0.9800722527572525 key: test_fscore value: [0.94444444 0.91891892 0.91891892 0.94736842 0.94736842 0.91428571 0.94444444 1. 0.85714286 0.97297297] mean value: 0.9365865113233535 key: train_fscore value: [0.98181818 0.9787234 0.97280967 0.98480243 0.98170732 0.97859327 0.98170732 0.97859327 0.98192771 0.97885196] mean value: 0.9799534538436605 key: test_precision value: [0.94444444 0.89473684 0.89473684 0.9 0.94736842 1. 1. 1. 0.88235294 0.94736842] mean value: 0.9411007911936704 key: train_precision value: [0.98780488 0.98773006 0.97575758 0.99386503 0.98773006 0.98765432 0.98773006 0.98765432 0.98192771 0.98181818] mean value: 0.9859672203167147 key: test_recall value: [0.94444444 0.94444444 0.94444444 1. 0.94736842 0.84210526 0.89473684 1. 0.83333333 1. ] mean value: 0.9350877192982456 key: train_recall value: [0.97590361 0.96987952 0.96987952 0.97590361 0.97575758 0.96969697 0.97575758 0.96969697 0.98192771 0.97590361] mean value: 0.9740306681270536 key: test_roc_auc value: [0.94590643 0.91959064 0.91959064 0.94736842 0.94590643 0.92105263 0.94736842 1. 
0.86111111 0.97222222] mean value: 0.9380116959064327 key: train_roc_auc value: [0.9818912 0.97887915 0.97281855 0.9849215 0.98185469 0.97882439 0.98185469 0.97882439 0.98192771 0.97891566] mean value: 0.9800711938663746 key: test_jcc value: [0.89473684 0.85 0.85 0.9 0.9 0.84210526 0.89473684 1. 0.75 0.94736842] mean value: 0.8828947368421053 key: train_jcc value: [0.96428571 0.95833333 0.94705882 0.97005988 0.96407186 0.95808383 0.96407186 0.95808383 0.96449704 0.95857988] mean value: 0.9607126051710413 MCC on Blind test: 0.86 Accuracy on Blind test: 0.93 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02643442 0.01237798 0.00983572 0.01009274 0.00978518 0.0098691 0.00975704 0.00971055 0.01049852 0.00966072] mean value: 0.011802196502685547 key: score_time value: [0.01092529 0.00903606 0.00895619 0.00881696 0.00897527 0.008919 0.00883269 0.00885367 0.008847 0.00878906] mean value: 0.009095120429992675 key: test_mcc value: [0.74044197 0.62280702 0.4633451 0.57184997 0.56934383 0.62170355 0.7888597 0.6754386 0.4472136 0.78262379] mean value: 0.628362712085414 key: train_mcc value: [0.73459045 0.71601738 0.79022336 0.74626648 0.74713145 0.68655466 0.70405667 0.72205184 0.77108434 0.76674551] mean value: 0.7384722135344838 key: 
test_accuracy value: [0.86486486 0.81081081 0.72972973 0.78378378 0.78378378 0.81081081 0.89189189 0.83783784 0.72222222 0.88888889] mean value: 0.8124624624624625 key: train_accuracy value: [0.86706949 0.85800604 0.89425982 0.87311178 0.87311178 0.8429003 0.85196375 0.86102719 0.88554217 0.88253012] mean value: 0.8689522440214028 key: test_fscore value: [0.87179487 0.81081081 0.73684211 0.78947368 0.8 0.82051282 0.88888889 0.84210526 0.70588235 0.88235294] mean value: 0.8148663738756617 key: train_fscore value: [0.86982249 0.85885886 0.89795918 0.8742515 0.87573964 0.83850932 0.85285285 0.86060606 0.88554217 0.88629738] mean value: 0.8700439444712924 key: test_precision value: [0.80952381 0.78947368 0.7 0.75 0.76190476 0.8 0.94117647 0.84210526 0.75 0.9375 ] mean value: 0.8081683989385228 key: train_precision value: [0.85465116 0.85628743 0.8700565 0.86904762 0.85549133 0.85987261 0.8452381 0.86060606 0.88554217 0.85875706] mean value: 0.8615550031773642 key: test_recall value: [0.94444444 0.83333333 0.77777778 0.83333333 0.84210526 0.84210526 0.84210526 0.84210526 0.66666667 0.83333333] mean value: 0.8257309941520468 key: train_recall value: [0.88554217 0.86144578 0.92771084 0.87951807 0.8969697 0.81818182 0.86060606 0.86060606 0.88554217 0.91566265] mean value: 0.8791785323110625 key: test_roc_auc value: [0.86695906 0.81140351 0.73099415 0.78508772 0.78216374 0.80994152 0.89327485 0.8377193 0.72222222 0.88888889] mean value: 0.8128654970760234 key: train_roc_auc value: [0.86701351 0.85799562 0.89415845 0.87309237 0.87318364 0.84282585 0.85198978 0.86102592 0.88554217 0.88253012] mean value: 0.8689357429718876 key: test_jcc value: [0.77272727 0.68181818 0.58333333 0.65217391 0.66666667 0.69565217 0.8 0.72727273 0.54545455 0.78947368] mean value: 0.6914572498439775 key: train_jcc value: [0.76963351 0.75263158 0.81481481 0.77659574 0.77894737 0.72192513 0.7434555 0.75531915 0.79459459 0.79581152] mean value: 0.77037289076449 MCC on Blind test: 0.73 Accuracy on 
Blind test: 0.86 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), 
('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.09763837 0.05395174 0.05793595 0.10008216 0.05730486 0.0574739 0.05634022 0.06836796 0.07902384 0.05754972] mean value: 0.06856687068939209 key: score_time value: [0.01159644 0.01121736 0.01094055 0.01107693 0.01040983 0.01041222 0.01046443 0.01227784 0.01092386 0.0105648 ] mean value: 0.010988426208496094 key: test_mcc value: [0.94736842 0.89736456 0.94721815 0.94736842 0.94736842 0.89736456 0.94736842 1. 0.72333935 1. ] mean value: 0.9254760307581459 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.97297297 0.94594595 0.97297297 0.97297297 0.97297297 0.94594595 0.97297297 1. 0.86111111 1. ] mean value: 0.9617867867867869 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97297297 0.94736842 0.97142857 0.97297297 0.97297297 0.94444444 0.97297297 1. 0.86486486 1. ] mean value: 0.9619998193682404 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94736842 0.9 1. 0.94736842 1. 1. 1. 1. 0.84210526 1. ] mean value: 0.9636842105263158 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.94444444 1. 0.94736842 0.89473684 0.94736842 1. 0.88888889 1. ] mean value: 0.962280701754386 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.97222222 0.97368421 0.97368421 0.94736842 0.97368421 1. 0.86111111 1. ] mean value: 0.962280701754386 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94736842 0.9 0.94444444 0.94736842 0.94736842 0.89473684 0.94736842 1. 0.76190476 1. ] mean value: 0.9290559732664996 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.03597736 0.07507443 0.06862307 0.07008171 0.03527355 0.04518008 0.06671071 0.0492239 0.0749712 0.06974387] mean value: 0.05908598899841309 key: score_time value: [0.02051544 0.02273822 0.02037239 0.01225233 0.01217675 0.02001953 0.01223707 0.02085972 0.02115345 0.01531434] mean value: 0.01776392459869385 key: test_mcc value: [0.7888597 0.78764146 0.51461988 0.56725146 0.78362573 0.62280702 0.83918129 0.73020842 0.55901699 0.89442719] mean value: 0.7087639142980873 key: train_mcc value: [0.9577218 0.95786323 0.9577218 0.93957649 0.95772025 0.93355239 0.94563511 0.94563511 0.95180723 0.95208368] mean value: 0.9499317074650987 key: test_accuracy value: [0.89189189 0.89189189 0.75675676 0.78378378 0.89189189 0.81081081 0.91891892 0.86486486 0.77777778 0.94444444] mean value: 0.8533033033033033 key: train_accuracy value: [0.97885196 0.97885196 0.97885196 0.96978852 0.97885196 0.96676737 0.97280967 0.97280967 0.97590361 0.97590361] mean value: 0.974939031048666 key: test_fscore value: [0.89473684 0.88235294 0.75675676 0.77777778 0.89473684 0.81081081 0.91891892 0.87179487 0.76470588 0.94736842] mean value: 
0.8519960064851706 key: train_fscore value: [0.97885196 0.9787234 0.97885196 0.96987952 0.9787234 0.96676737 0.97264438 0.97264438 0.97590361 0.97560976] mean value: 0.9748599750031368 key: test_precision value: [0.85 0.9375 0.73684211 0.77777778 0.89473684 0.83333333 0.94444444 0.85 0.8125 0.9 ] mean value: 0.8537134502923976 key: train_precision value: [0.98181818 0.98773006 0.98181818 0.96987952 0.98170732 0.96385542 0.97560976 0.97560976 0.97590361 0.98765432] mean value: 0.9781586129458871 key: test_recall value: [0.94444444 0.83333333 0.77777778 0.77777778 0.89473684 0.78947368 0.89473684 0.89473684 0.72222222 1. ] mean value: 0.8529239766081871 key: train_recall value: [0.97590361 0.96987952 0.97590361 0.96987952 0.97575758 0.96969697 0.96969697 0.96969697 0.97590361 0.96385542] mean value: 0.9716173786053304 key: test_roc_auc value: [0.89327485 0.89035088 0.75730994 0.78362573 0.89181287 0.81140351 0.91959064 0.86403509 0.77777778 0.94444444] mean value: 0.8533625730994152 key: train_roc_auc value: [0.9788609 0.97887915 0.9788609 0.96978824 0.97884264 0.9667762 0.97280029 0.97280029 0.97590361 0.97590361] mean value: 0.9749415845198979 key: test_jcc value: [0.80952381 0.78947368 0.60869565 0.63636364 0.80952381 0.68181818 0.85 0.77272727 0.61904762 0.9 ] mean value: 0.7477173665388769 key: train_jcc value: [0.95857988 0.95833333 0.95857988 0.94152047 0.95833333 0.93567251 0.94674556 0.94674556 0.95294118 0.95238095] mean value: 0.9509832665548312 MCC on Blind test: 0.73 Accuracy on Blind test: 0.86 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra 
Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02032638 0.01103067 0.01079988 0.01054168 0.01051974 0.01051974 0.0105474 0.01089859 0.0105722 0.01062846] mean value: 0.011638474464416505 key: score_time value: [0.01018023 0.00987649 0.00954866 0.00943232 0.00947976 0.00946927 0.00955534 0.00968814 0.00956678 0.00959611] mean value: 0.009639310836791991 key: test_mcc value: [0.73020842 0.29824561 0.57184997 0.73099415 0.62280702 0.40469382 0.7888597 0.57184997 0.3721042 0.89442719] mean value: 0.5986040048640627 key: train_mcc value: [0.65133406 0.56567532 0.71002957 0.70459299 0.69830851 0.60223279 0.68655466 0.63906236 0.67014765 0.73493976] mean value: 0.6662877681119577 key: test_accuracy value: [0.86486486 0.64864865 0.78378378 0.86486486 0.81081081 0.7027027 0.89189189 0.78378378 0.66666667 0.94444444] mean value: 0.7962462462462463 key: train_accuracy value: [0.82477341 0.78247734 0.85498489 0.85196375 0.8489426 0.80060423 0.8429003 0.81873112 0.83433735 0.86746988] mean value: 0.8327184872420195 key: test_fscore value: [0.85714286 0.64864865 0.78947368 0.86486486 0.81081081 0.71794872 0.88888889 0.77777778 0.57142857 0.94117647] mean value: 0.7868161292309899 key: train_fscore value: [0.81875 0.77777778 0.85454545 0.84923077 0.84567901 0.79375 0.83850932 0.81132075 0.82866044 0.86746988] mean value: 0.8285693401041992 key: test_precision value: [0.88235294 0.63157895 0.75 0.84210526 0.83333333 0.7 0.94117647 0.82352941 0.8 1. 
] mean value: 0.820407636738906 key: train_precision value: [0.85064935 0.79746835 0.8597561 0.86792453 0.86163522 0.81935484 0.85987261 0.84313725 0.85806452 0.86746988] mean value: 0.848533265179209 key: test_recall value: [0.83333333 0.66666667 0.83333333 0.88888889 0.78947368 0.73684211 0.84210526 0.73684211 0.44444444 0.88888889] mean value: 0.7660818713450293 key: train_recall value: [0.78915663 0.75903614 0.84939759 0.8313253 0.83030303 0.76969697 0.81818182 0.78181818 0.80120482 0.86746988] mean value: 0.8097590361445783 key: test_roc_auc value: [0.86403509 0.64912281 0.78508772 0.86549708 0.81140351 0.70175439 0.89327485 0.78508772 0.66666667 0.94444444] mean value: 0.7966374269005848 key: train_roc_auc value: [0.82488134 0.78254838 0.85500183 0.85202629 0.84888645 0.80051114 0.84282585 0.81861993 0.83433735 0.86746988] mean value: 0.832710843373494 key: test_jcc value: [0.75 0.48 0.65217391 0.76190476 0.68181818 0.56 0.8 0.63636364 0.4 0.88888889] mean value: 0.6611149382018947 key: train_jcc value: [0.69312169 0.63636364 0.74603175 0.73796791 0.73262032 0.65803109 0.72192513 0.68253968 0.70744681 0.76595745] mean value: 0.7082005470442766 MCC on Blind test: 0.77 Accuracy on Blind test: 0.89 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, 
n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01300788 0.01797628 0.01849413 0.01570034 0.01609373 0.02065372 0.02061057 0.01798749 0.01872349 0.01886225] mean value: 0.017810988426208495 key: score_time value: [0.00886083 0.01124477 0.0111165 0.01169562 0.01177621 0.0118463 0.01175761 0.0117023 0.01185799 0.01174355] mean value: 0.01136016845703125 key: test_mcc value: [0.80369958 0.51793973 0.56725146 0.73020842 0.78362573 0.78362573 0.84959079 0.83918129 0.79772404 0.73246702] mean value: 0.7405313784220269 key: train_mcc value: [0.85241016 0.90101455 0.93492806 0.76178654 0.87390869 0.91019063 0.93436201 0.89441747 0.82781591 0.90978714] mean value: 0.8800621170623023 key: test_accuracy value: [0.89189189 0.75675676 0.78378378 0.86486486 0.89189189 0.89189189 0.91891892 0.91891892 0.88888889 0.86111111] mean value: 0.8668918918918919 key: train_accuracy value: [0.9244713 0.94864048 0.96676737 0.87009063 0.93655589 0.95468278 0.96676737 0.94561934 0.90662651 0.95481928] mean value: 0.9375040949295672 key: test_fscore value: [0.9 0.72727273 0.77777778 0.85714286 0.89473684 0.89473684 0.91428571 0.91891892 0.875 0.87179487] mean value: 0.8631666551403394 key: train_fscore value: [0.92795389 0.94637224 0.96594427 0.85324232 0.93768546 0.95548961 0.96594427 0.94303797 0.89700997 0.95440729] mean value: 0.9347087306426057 key: test_precision value: [0.81818182 0.8 0.77777778 0.88235294 0.89473684 0.89473684 1. 0.94444444 1. 
0.80952381] mean value: 0.8821754475314847 key: train_precision value: [0.88950276 0.99337748 0.99363057 0.98425197 0.91860465 0.93604651 0.98734177 0.98675497 1. 0.96319018] mean value: 0.9652700873506086 key: test_recall value: [1. 0.66666667 0.77777778 0.83333333 0.89473684 0.89473684 0.84210526 0.89473684 0.77777778 0.94444444] mean value: 0.8526315789473684 key: train_recall value: [0.96987952 0.90361446 0.93975904 0.75301205 0.95757576 0.97575758 0.94545455 0.9030303 0.81325301 0.94578313] mean value: 0.9107119386637459 key: test_roc_auc value: [0.89473684 0.75438596 0.78362573 0.86403509 0.89181287 0.89181287 0.92105263 0.91959064 0.88888889 0.86111111] mean value: 0.8671052631578947 key: train_roc_auc value: [0.9243337 0.94877693 0.96684922 0.87044542 0.9366192 0.95474626 0.96670318 0.94549106 0.90662651 0.95481928] mean value: 0.9375410733844468 key: test_jcc value: [0.81818182 0.57142857 0.63636364 0.75 0.80952381 0.80952381 0.84210526 0.85 0.77777778 0.77272727] mean value: 0.763763195868459 key: train_jcc value: [0.8655914 0.89820359 0.93413174 0.74404762 0.88268156 0.91477273 0.93413174 0.89221557 0.81325301 0.9127907 ] mean value: 0.8791819652868769 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, 
n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01939464 0.01617098 0.01889229 0.01758027 0.01637602 0.01791692 0.01738191 0.03996897 0.0171361 0.01589084] mean value: 0.019670891761779784 key: score_time value: [0.01182365 0.01187801 0.01180458 0.01176167 0.01184773 0.02823496 0.01288629 0.01227355 0.01188588 0.01173544] mean value: 0.013613176345825196 key: test_mcc value: [0.7163504 0.51478965 0.62280702 0.51121719 0.78362573 0.84959079 0.59234888 0.78362573 0.78262379 0.78262379] mean value: 0.693960296854044 key: train_mcc value: [0.84650478 0.75381029 0.95278173 0.57559806 0.91540708 0.92917693 0.63638198 0.90359336 0.91893658 0.87618491] mean value: 0.8308375712756191 key: test_accuracy value: [0.83783784 0.72972973 0.81081081 0.7027027 0.89189189 0.91891892 0.75675676 0.89189189 0.88888889 0.88888889] mean value: 0.8318318318318318 key: train_accuracy value: [0.918429 0.86404834 0.97583082 0.74924471 0.95770393 0.96374622 0.78851964 0.95166163 0.95783133 0.93674699] mean value: 0.9063762603283223 key: test_fscore value: [0.85714286 0.77272727 0.81081081 0.76595745 0.89473684 0.91428571 0.68965517 0.89473684 0.88235294 0.89473684] mean value: 0.8377142741681218 key: train_fscore value: [0.92436975 0.88 0.97530864 0.8 0.95757576 0.9625 0.73076923 0.95209581 0.95597484 0.93913043] mean value: 0.9077724464152594 key: test_precision value: [0.75 0.65384615 0.78947368 0.62068966 0.89473684 1. 1. 0.89473684 0.9375 0.85 ] mean value: 0.839098317743962 key: train_precision value: [0.86387435 0.78947368 1. 0.66666667 0.95757576 0.99354839 1. 0.9408284 1. 
0.90502793] mean value: 0.9116995176427221 key: test_recall value: [1. 0.94444444 0.83333333 1. 0.89473684 0.84210526 0.52631579 0.89473684 0.83333333 0.94444444] mean value: 0.8713450292397661 key: train_recall value: [0.9939759 0.9939759 0.95180723 1. 0.95757576 0.93333333 0.57575758 0.96363636 0.91566265 0.97590361] mean value: 0.926162833150785 key: test_roc_auc value: [0.84210526 0.73538012 0.81140351 0.71052632 0.89181287 0.92105263 0.76315789 0.89181287 0.88888889 0.88888889] mean value: 0.8345029239766082 key: train_roc_auc value: [0.91820007 0.86365462 0.97590361 0.74848485 0.95770354 0.96365462 0.78787879 0.9516977 0.95783133 0.93674699] mean value: 0.9061756115370574 key: test_jcc value: [0.75 0.62962963 0.68181818 0.62068966 0.80952381 0.84210526 0.52631579 0.80952381 0.78947368 0.80952381] mean value: 0.7268603632033759 key: train_jcc value: [0.859375 0.78571429 0.95180723 0.66666667 0.91860465 0.92771084 0.57575758 0.90857143 0.91566265 0.8852459 ] mean value: 0.8395116232403658 MCC on Blind test: 0.77 Accuracy on Blind test: 0.88 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.15993023 0.14477468 0.14659452 0.14503288 0.14535022 0.14776993 0.14721227 0.14613962 0.14871216 0.1482687 ] mean value: 0.1479785203933716 key: score_time value: [0.01520491 0.01517415 0.0165267 0.01517439 0.01506782 0.01651335 0.01520538 0.01570082 0.01625323 0.01571679] mean value: 0.01565375328063965 key: test_mcc value: [1. 0.83918129 0.94721815 0.94736842 0.94736842 0.94736842 0.94736842 0.94736842 0.78262379 1. ] mean value: 0.9305865333163313 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.91891892 0.97297297 0.97297297 0.97297297 0.97297297 0.97297297 0.97297297 0.88888889 1. ] mean value: 0.9645645645645646 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.91891892 0.97142857 0.97297297 0.97297297 0.97297297 0.97297297 0.97297297 0.89473684 1. ] mean value: 0.9649949197317619 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.89473684 1. 0.94736842 1. 1. 1. 1. 0.85 1. ] mean value: 0.9692105263157895 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.94444444 0.94444444 1. 0.94736842 0.94736842 0.94736842 0.94736842 0.94444444 1. ] mean value: 0.962280701754386 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.91959064 0.97222222 0.97368421 0.97368421 0.97368421 0.97368421 0.97368421 0.88888889 1. ] mean value: 0.9649122807017544 key: train_roc_auc value: [1. 1. 1. 1. 1. 
1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.85 0.94444444 0.94736842 0.94736842 0.94736842 0.94736842 0.94736842 0.80952381 1. ] mean value: 0.9340810359231412 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', 
AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.04762053 0.05938816 0.05389047 0.07260513 0.06046724 0.05754685 0.05538869 0.05734062 0.06089234 0.06812143] mean value: 0.05932614803314209 key: score_time value: [0.01756454 0.02058482 0.03347611 0.03177166 0.02301621 0.04879737 0.0274539 0.0281961 0.03790355 0.03870583] mean value: 0.030747008323669434 key: test_mcc value: [0.94736842 0.84959079 1. 0.94736842 0.89181287 0.89736456 0.94736842 0.89736456 0.83462233 1. ] mean value: 0.9212860370101059 key: train_mcc value: [0.9939759 0.9818912 0.98203528 0.9939759 0.9879153 0.98189054 0.98189054 0.9879153 0.99399394 0.98194553] mean value: 0.9867429430855921 key: test_accuracy value: [0.97297297 0.91891892 1. 0.97297297 0.94594595 0.94594595 0.97297297 0.94594595 0.91666667 1. 
] mean value: 0.9592342342342343 key: train_accuracy value: [0.99697885 0.99093656 0.99093656 0.99697885 0.9939577 0.99093656 0.99093656 0.9939577 0.99698795 0.99096386] mean value: 0.9933571142576347 key: test_fscore value: [0.97297297 0.92307692 1. 0.97297297 0.94736842 0.94444444 0.97297297 0.94444444 0.91891892 1. ] mean value: 0.9597172070856281 key: train_fscore value: [0.99697885 0.99093656 0.99088146 0.99697885 0.99393939 0.99088146 0.99088146 0.99393939 0.99697885 0.99093656] mean value: 0.99333328324522 key: test_precision value: [0.94736842 0.85714286 1. 0.94736842 0.94736842 1. 1. 1. 0.89473684 1. ] mean value: 0.9593984962406015 key: train_precision value: [1. 0.99393939 1. 1. 0.99393939 0.99390244 0.99390244 0.99393939 1. 0.99393939] mean value: 0.9963562453806356 key: test_recall value: [1. 1. 1. 1. 0.94736842 0.89473684 0.94736842 0.89473684 0.94444444 1. ] mean value: 0.9628654970760234 key: train_recall value: [0.9939759 0.98795181 0.98192771 0.9939759 0.99393939 0.98787879 0.98787879 0.99393939 0.9939759 0.98795181] mean value: 0.9903395399780942 key: test_roc_auc value: [0.97368421 0.92105263 1. 0.97368421 0.94590643 0.94736842 0.97368421 0.94736842 0.91666667 1. ] mean value: 0.9599415204678362 key: train_roc_auc value: [0.99698795 0.9909456 0.99096386 0.99698795 0.99395765 0.99092735 0.99092735 0.99395765 0.99698795 0.99096386] mean value: 0.9933607155896312 key: test_jcc value: [0.94736842 0.85714286 1. 0.94736842 0.9 0.89473684 0.94736842 0.89473684 0.85 1. 
] mean value: 0.9238721804511278 key: train_jcc value: [0.9939759 0.98203593 0.98192771 0.9939759 0.98795181 0.98192771 0.98192771 0.98795181 0.9939759 0.98203593] mean value: 0.9867686314118751 MCC on Blind test: 0.9 Accuracy on Blind test: 0.95 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.07490635 0.11946273 0.103163 0.10210061 0.1119535 0.11135602 0.16045046 0.09157825 0.10720015 0.11908531] mean value: 0.11012563705444336 key: score_time value: [0.02171206 0.02200747 0.02642679 0.02162862 0.02213502 0.02532601 0.02199364 0.02181697 0.02197194 0.02154231] mean value: 0.02265608310699463 key: test_mcc value: [0.56725146 0.6754386 0.35104619 0.63129316 0.4633451 0.56725146 0.62280702 0.60308132 0.5007734 0.78262379] mean value: 0.5764911497325073 key: train_mcc value: [0.9939759 0.9939759 0.9939759 0.9939759 0.99397568 1. 1. 0.99397568 0.99399394 0.99399394] mean value: 0.9951842862455198 key: test_accuracy value: [0.78378378 0.83783784 0.67567568 0.81081081 0.72972973 0.78378378 0.81081081 0.78378378 0.75 0.88888889] mean value: 0.7855105105105105 key: train_accuracy value: [0.99697885 0.99697885 0.99697885 0.99697885 0.99697885 1. 1. 
0.99697885 0.99698795 0.99698795] mean value: 0.9975849015396935 key: test_fscore value: [0.77777778 0.83333333 0.64705882 0.82051282 0.72222222 0.78947368 0.81081081 0.75 0.75675676 0.88235294] mean value: 0.779029917033013 key: train_fscore value: [0.99697885 0.99697885 0.99697885 0.99697885 0.99696049 1. 1. 0.99696049 0.99697885 0.99697885] mean value: 0.9975794084426854 key: test_precision value: [0.77777778 0.83333333 0.6875 0.76190476 0.76470588 0.78947368 0.83333333 0.92307692 0.73684211 0.9375 ] mean value: 0.8045447801252755 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.77777778 0.83333333 0.61111111 0.88888889 0.68421053 0.78947368 0.78947368 0.63157895 0.77777778 0.83333333] mean value: 0.7616959064327485 key: train_recall value: [0.9939759 0.9939759 0.9939759 0.9939759 0.99393939 1. 1. 0.99393939 0.9939759 0.9939759 ] mean value: 0.9951734209565535 key: test_roc_auc value: [0.78362573 0.8377193 0.67397661 0.8128655 0.73099415 0.78362573 0.81140351 0.7880117 0.75 0.88888889] mean value: 0.7861111111111111 key: train_roc_auc value: [0.99698795 0.99698795 0.99698795 0.99698795 0.9969697 1. 1. 0.9969697 0.99698795 0.99698795] mean value: 0.9975867104782767 key: test_jcc value: [0.63636364 0.71428571 0.47826087 0.69565217 0.56521739 0.65217391 0.68181818 0.6 0.60869565 0.78947368] mean value: 0.6421941216678059 key: train_jcc value: [0.9939759 0.9939759 0.9939759 0.9939759 0.99393939 1. 1. 
0.99393939 0.9939759 0.9939759 ] mean value: 0.9951734209565535 MCC on Blind test: 0.54 Accuracy on Blind test: 0.77 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.56603932 0.55516958 0.56111884 0.5529871 0.55033922 0.55430651 0.53636289 0.54783416 0.5531528 0.55002046] mean value: 0.5527330875396729 key: score_time value: [0.01002455 0.00995398 0.00930357 0.00971341 0.01047468 0.00944734 0.00963187 0.00966716 0.00973773 0.00942183] mean value: 0.009737610816955566 key: test_mcc value: [0.89181287 0.7888597 1. 0.94736842 0.89181287 0.94736842 0.94736842 1. 0.78262379 1. ] mean value: 0.9197214483309992 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94594595 0.89189189 1. 0.97297297 0.94594595 0.97297297 0.97297297 1. 0.88888889 1. ] mean value: 0.9591591591591592 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94444444 0.89473684 1. 0.97297297 0.94736842 0.97297297 0.97297297 1. 0.89473684 1. ] mean value: 0.9600205468626521 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94444444 0.85 1. 0.94736842 0.94736842 1. 1. 1. 0.85 1. 
] mean value: 0.9539181286549707 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.94444444 0.94444444 1. 1. 0.94736842 0.94736842 0.94736842 1. 0.94444444 1. ] mean value: 0.9675438596491228 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94590643 0.89327485 1. 0.97368421 0.94590643 0.97368421 0.97368421 1. 0.88888889 1. ] mean value: 0.9595029239766082 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.89473684 0.80952381 1. 0.94736842 0.9 0.94736842 0.94736842 1. 0.80952381 1. ] mean value: 0.9255889724310777 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, 
num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02545977 0.02662921 0.03351521 0.02726531 0.02694201 0.02765965 0.02748775 0.02727795 0.02772045 0.02956676] mean value: 0.027952408790588378 key: score_time value: [0.01267028 0.01618814 0.01552796 0.01530409 0.01815295 0.01537704 0.01538992 0.01563191 0.01558256 0.01613903] mean value: 0.015596389770507812 key: test_mcc value: [0.51319869 0.40643275 0.02932564 0.1378305 0.4163404 0.1378305 0.36315314 0.46019501 0.35355339 0.35355339] mean value: 0.3171413424046587 key: train_mcc value: [0.91872008 0.9939759 0.98203333 0.83781349 0.96374589 0.80665108 0.96437604 0.9818912 0.97618706 0.86450473] mean value: 0.9289898789692037 key: test_accuracy value: [0.75675676 0.7027027 0.51351351 0.56756757 0.7027027 0.56756757 0.67567568 0.72972973 0.66666667 0.66666667] mean value: 0.6549549549549549 key: train_accuracy value: [0.95770393 0.99697885 0.99093656 0.91238671 0.98187311 0.89425982 0.98187311 0.99093656 0.98795181 0.92771084] mean value: 0.9622611291085793 key: test_fscore value: [0.74285714 0.7027027 0.52631579 0.57894737 0.74418605 0.55555556 0.64705882 0.75 0.71428571 0.71428571] mean value: 0.6676194857622606 key: train_fscore value: [0.95597484 0.99697885 0.99104478 0.90429043 0.98181818 0.88135593 0.98148148 0.99093656 0.98809524 0.93258427] mean value: 0.96045605590458 key: test_precision value: [0.76470588 0.68421053 0.5 0.55 0.66666667 0.58823529 0.73333333 0.71428571 0.625 0.625 ] mean value: 0.6451437417072092 key: train_precision value: [1. 1. 0.98224852 1. 0.98181818 1. 1. 
0.98795181 0.97647059 0.87368421] mean value: 0.9802173308518767 key: test_recall value: [0.72222222 0.72222222 0.55555556 0.61111111 0.84210526 0.52631579 0.57894737 0.78947368 0.83333333 0.83333333] mean value: 0.7014619883040936 key: train_recall value: [0.91566265 0.9939759 1. 0.8253012 0.98181818 0.78787879 0.96363636 0.99393939 1. 1. ] mean value: 0.9462212486308872 key: test_roc_auc value: [0.75584795 0.70321637 0.51461988 0.56871345 0.69883041 0.56871345 0.67836257 0.72807018 0.66666667 0.66666667] mean value: 0.6549707602339181 key: train_roc_auc value: [0.95783133 0.99698795 0.99090909 0.9126506 0.98187295 0.89393939 0.98181818 0.9909456 0.98795181 0.92771084] mean value: 0.9622617743702081 key: test_jcc value: [0.59090909 0.54166667 0.35714286 0.40740741 0.59259259 0.38461538 0.47826087 0.6 0.55555556 0.55555556] mean value: 0.5063705980010328 key: train_jcc value: [0.91566265 0.9939759 0.98224852 0.8253012 0.96428571 0.78787879 0.96363636 0.98203593 0.97647059 0.87368421] mean value: 0.9265179872452391 MCC on Blind test: 0.5 Accuracy on Blind test: 0.75 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02361274 0.05441642 0.03723788 0.04656839 0.03769898 0.03737879 0.04589653 0.03744817 0.03729868 0.03707409] mean value: 0.03946306705474854 key: score_time value: [0.02177238 0.02057743 0.02044559 0.02325034 0.01888037 0.02219844 0.0233705 0.0234704 0.0239017 0.02041125] mean value: 0.021827840805053712 key: test_mcc value: [0.94736842 0.73821295 0.56725146 0.73099415 0.78362573 0.73099415 0.84959079 0.89181287 0.72333935 0.83462233] mean value: 0.7797812196768581 key: train_mcc value: [0.88520939 0.89729828 0.91547702 0.87940108 0.91547085 0.89729828 0.89729828 0.88521358 0.89182522 0.89156627] mean value: 0.8956058253044522 key: test_accuracy value: [0.97297297 0.86486486 0.78378378 0.86486486 0.89189189 0.86486486 0.91891892 0.94594595 0.86111111 0.91666667] mean value: 0.8885885885885886 key: train_accuracy value: [0.94259819 0.94864048 0.95770393 0.93957704 0.95770393 0.94864048 0.94864048 0.94259819 0.94578313 0.94578313] mean value: 0.9477668984093474 key: test_fscore value: [0.97297297 0.84848485 0.77777778 0.86486486 0.89473684 0.86486486 0.91428571 0.94736842 0.85714286 0.91891892] mean value: 0.8861418082470714 /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:176: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:179: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: train_fscore value: [0.94294294 0.94864048 0.95757576 0.94047619 0.95731707 0.94864048 0.94864048 0.94259819 0.94642857 0.94578313] mean value: 0.947904330558655 key: test_precision value: [0.94736842 0.93333333 0.77777778 0.84210526 0.89473684 0.88888889 1. 0.94736842 0.88235294 0.89473684] mean value: 0.9008668730650154 key: train_precision value: [0.94011976 0.95151515 0.96341463 0.92941176 0.96319018 0.94578313 0.94578313 0.93975904 0.93529412 0.94578313] mean value: 0.9460054046277495 key: test_recall value: [1. 0.77777778 0.77777778 0.88888889 0.89473684 0.84210526 0.84210526 0.94736842 0.83333333 0.94444444] mean value: 0.8748538011695907 key: train_recall value: [0.94578313 0.94578313 0.95180723 0.95180723 0.95151515 0.95151515 0.95151515 0.94545455 0.95783133 0.94578313] mean value: 0.9498795180722892 key: test_roc_auc value: [0.97368421 0.8625731 0.78362573 0.86549708 0.89181287 0.86549708 0.92105263 0.94590643 0.86111111 0.91666667] mean value: 0.8887426900584795 key: train_roc_auc value: [0.94258854 0.94864914 0.9577218 0.93953998 0.95768529 0.94864914 0.94864914 0.94260679 0.94578313 0.94578313] mean value: 0.9477656078860899 key: test_jcc value: [0.94736842 0.73684211 0.63636364 0.76190476 0.80952381 0.76190476 0.84210526 0.9 0.75 0.85 ] mean value: 0.7996012759170654 key: train_jcc value: [0.89204545 0.90229885 0.91860465 0.88764045 0.91812865 0.90229885 0.90229885 0.89142857 0.89830508 0.89714286] mean value: 0.9010192275158537 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', 
LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', 
ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.25129724 0.32188392 0.33577132 0.31489706 0.26630807 0.47269154 0.30162644 0.28344464 0.26365685 0.33231711] mean value: 0.3143894195556641 key: score_time value: [0.01923299 0.02272201 0.02159452 0.02046084 0.01954365 0.02145052 0.02375412 0.02187276 0.01636934 0.02253842] mean value: 0.020953917503356935 key: test_mcc value: [0.94736842 0.73821295 0.56725146 0.73099415 0.78362573 0.73099415 0.84959079 0.89181287 0.72333935 0.83462233] mean value: 0.7797812196768581 key: train_mcc value: [0.88520939 0.89729828 0.91547702 0.87940108 0.91547085 0.89729828 0.89729828 0.80061339 0.89182522 0.89156627] mean value: 0.8871458056914586 key: test_accuracy value: [0.97297297 0.86486486 0.78378378 0.86486486 0.89189189 0.86486486 0.91891892 0.94594595 0.86111111 0.91666667] mean value: 0.8885885885885886 key: train_accuracy value: [0.94259819 0.94864048 0.95770393 0.93957704 0.95770393 0.94864048 0.94864048 0.90030211 0.94578313 0.94578313] mean value: 0.9435372911585921 key: test_fscore value: [0.97297297 0.84848485 0.77777778 0.86486486 0.89473684 0.86486486 0.91428571 0.94736842 0.85714286 0.91891892] mean value: 0.8861418082470714 key: train_fscore value: [0.94294294 0.94864048 0.95757576 0.94047619 0.95731707 0.94864048 0.94864048 0.89969605 0.94642857 0.94578313] mean value: 0.9436141166907591 key: 
test_precision value: [0.94736842 0.93333333 0.77777778 0.84210526 0.89473684 0.88888889 1. 0.94736842 0.88235294 0.89473684] mean value: 0.9008668730650154 key: train_precision value: [0.94011976 0.95151515 0.96341463 0.92941176 0.96319018 0.94578313 0.94578313 0.90243902 0.93529412 0.94578313] mean value: 0.9422734034523161 key: test_recall value: [1. 0.77777778 0.77777778 0.88888889 0.89473684 0.84210526 0.84210526 0.94736842 0.83333333 0.94444444] mean value: 0.8748538011695907 key: train_recall value: [0.94578313 0.94578313 0.95180723 0.95180723 0.95151515 0.95151515 0.95151515 0.8969697 0.95783133 0.94578313] mean value: 0.9450310332238043 key: test_roc_auc value: [0.97368421 0.8625731 0.78362573 0.86549708 0.89181287 0.86549708 0.92105263 0.94590643 0.86111111 0.91666667] mean value: 0.8887426900584795 key: train_roc_auc value: [0.94258854 0.94864914 0.9577218 0.93953998 0.95768529 0.94864914 0.94864914 0.90029208 0.94578313 0.94578313] mean value: 0.9435341365461848 key: test_jcc value: [0.94736842 0.73684211 0.63636364 0.76190476 0.80952381 0.76190476 0.84210526 0.9 0.75 0.85 ] mean value: 0.7996012759170654 key: train_jcc value: [0.89204545 0.90229885 0.91860465 0.88764045 0.91812865 0.90229885 0.90229885 0.81767956 0.89830508 0.89714286] mean value: 0.8936443261741015 MCC on Blind test: 0.79 Accuracy on Blind test: 0.9 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03955388 0.03381324 0.03329587 0.05509424 0.05626059 0.05854964 0.05459762 0.06159639 0.03439593 0.03141403] mean value: 0.04585714340209961 key: score_time value: [0.01206017 0.01217222 0.01196647 0.01265693 0.0123384 0.01210618 0.01433325 0.01214552 0.01429367 0.01195979] mean value: 0.012603259086608887 key: test_mcc value: [0.73786479 0.84327404 0.68421053 0.84327404 0.84327404 0.73786479 0.89973541 0.84327404 0.89736456 0.51319869] mean value: 0.7843334932834807 key: train_mcc value: [0.85906136 0.87648575 0.85888297 0.84119102 0.87660709 0.84705882 0.87058824 0.85888297 0.85929061 0.85930029] mean value: 0.8607349129613298 key: test_accuracy value: [0.86842105 0.92105263 0.84210526 0.92105263 0.92105263 0.86842105 0.94736842 0.92105263 0.94594595 0.75675676] mean value: 0.8913229018492176 key: train_accuracy value: [0.92941176 0.93823529 0.92941176 0.92058824 0.93823529 0.92352941 0.93529412 
0.92941176 0.92961877 0.92961877] mean value: 0.9303355183715715 key: test_fscore value: [0.87179487 0.92307692 0.84210526 0.91891892 0.91891892 0.87179487 0.94444444 0.91891892 0.94736842 0.76923077] mean value: 0.8926572321309163 key: train_fscore value: [0.93023256 0.93841642 0.92899408 0.92035398 0.93877551 0.92352941 0.93529412 0.92982456 0.93023256 0.92982456] mean value: 0.9305477766130446 key: test_precision value: [0.85 0.9 0.84210526 0.94444444 0.94444444 0.85 1. 0.94444444 0.9 0.75 ] mean value: 0.8925438596491228 key: train_precision value: [0.91954023 0.93567251 0.93452381 0.92307692 0.93063584 0.92352941 0.93529412 0.9244186 0.92485549 0.9244186 ] mean value: 0.9275965545299533 key: test_recall value: [0.89473684 0.94736842 0.84210526 0.89473684 0.89473684 0.89473684 0.89473684 0.89473684 1. 0.78947368] mean value: 0.8947368421052632 key: train_recall value: [0.94117647 0.94117647 0.92352941 0.91764706 0.94705882 0.92352941 0.93529412 0.93529412 0.93567251 0.93529412] mean value: 0.9335672514619883 key: test_roc_auc value: [0.86842105 0.92105263 0.84210526 0.92105263 0.92105263 0.86842105 0.94736842 0.92105263 0.94736842 0.75584795] mean value: 0.8913742690058479 key: train_roc_auc value: [0.92941176 0.93823529 0.92941176 0.92058824 0.93823529 0.92352941 0.93529412 0.92941176 0.92960096 0.92963536] mean value: 0.9303353973168215 key: test_jcc value: [0.77272727 0.85714286 0.72727273 0.85 0.85 0.77272727 0.89473684 0.85 0.9 0.625 ] mean value: 0.8099606971975393 key: train_jcc value: [0.86956522 0.8839779 0.86740331 0.85245902 0.88461538 0.8579235 0.87845304 0.86885246 0.86956522 0.86885246] mean value: 0.8701667505235628 MCC on Blind test: 0.83 Accuracy on Blind test: 0.91 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. 
of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.85755229 0.90421176 1.06443739 0.84898424 1.10972214 1.86754012 1.32581997 1.09845734 1.02783036 1.25375056] mean value: 1.1358306169509889 key: score_time value: [0.01481295 0.01703072 0.01232839 0.02471399 0.01229548 0.01258135 0.01538014 0.01235461 0.01231289 0.01262498] mean value: 0.014643549919128418 key: test_mcc value: [0.78947368 0.79388419 0.68421053 0.89473684 0.73786479 0.73786479 0.85280287 0.84327404 0.89736456 0.51319869] mean value: 0.7744674971633342 key: train_mcc value: [1. 0.90001557 0.88825066 0.87648575 0.82942611 0.98236994 0.88235294 0.82375747 0.77139024 0.82404541] mean value: 0.8778094100190499 key: test_accuracy value: [0.89473684 0.89473684 0.84210526 0.94736842 0.86842105 0.86842105 0.92105263 0.92105263 0.94594595 0.75675676] mean value: 0.8860597439544808 key: train_accuracy value: [1. 0.95 0.94411765 0.93823529 0.91470588 0.99117647 0.94117647 0.91176471 0.8856305 0.91202346] mean value: 0.9388830429532516 key: test_fscore value: [0.89473684 0.9 0.84210526 0.94736842 0.86486486 0.87179487 0.91428571 0.91891892 0.94736842 0.76923077] mean value: 0.887067408646356 key: train_fscore value: [1. 0.94985251 0.9439528 0.9380531 0.91445428 0.99115044 0.94117647 0.9127907 0.88495575 0.91176471] mean value: 0.9388150753201054 key: test_precision value: [0.89473684 0.85714286 0.84210526 0.94736842 0.88888889 0.85 1. 0.94444444 0.9 0.75 ] mean value: 0.887468671679198 key: train_precision value: [1. 
0.95266272 0.94674556 0.9408284 0.91715976 0.99408284 0.94117647 0.90229885 0.89285714 0.91176471] mean value: 0.9399576459843272 key: test_recall value: [0.89473684 0.94736842 0.84210526 0.94736842 0.84210526 0.89473684 0.84210526 0.89473684 1. 0.78947368] mean value: 0.8894736842105263 key: train_recall value: [1. 0.94705882 0.94117647 0.93529412 0.91176471 0.98823529 0.94117647 0.92352941 0.87719298 0.91176471] mean value: 0.9377192982456141 key: test_roc_auc value: [0.89473684 0.89473684 0.84210526 0.94736842 0.86842105 0.86842105 0.92105263 0.92105263 0.94736842 0.75584795] mean value: 0.8861111111111111 key: train_roc_auc value: [1. 0.95 0.94411765 0.93823529 0.91470588 0.99117647 0.94117647 0.91176471 0.88565531 0.9120227 ] mean value: 0.9388854489164087 key: test_jcc value: [0.80952381 0.81818182 0.72727273 0.9 0.76190476 0.77272727 0.84210526 0.85 0.9 0.625 ] mean value: 0.8006715652768285 key: train_jcc value: [1. 0.90449438 0.89385475 0.88333333 0.8423913 0.98245614 0.88888889 0.83957219 0.79365079 0.83783784] mean value: 0.886647962154875 MCC on Blind test: 0.76 Accuracy on Blind test: 0.88 Model_name: Gaussian NB Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01427627 0.01156187 0.01119947 0.01072907 0.01078057 0.01065731 0.01075244 0.01101494 0.01063251 0.01074815] mean value: 0.011235260963439941 key: score_time value: [0.0127666 0.01023865 0.01028371 0.00991011 0.00974965 0.00975084 0.00979042 0.00979543 0.00977588 0.00977397] mean value: 0.01018352508544922 key: test_mcc value: [0.63245553 0.54554473 0.59222009 0.59222009 0.58218174 0.37686733 0.85280287 0.63245553 0.73020842 0.62280702] mean value: 0.6159763343462977 key: train_mcc value: [0.67627507 0.65158377 0.68169173 0.62705429 0.65254612 0.6500365 0.62983758 0.67087425 0.64290652 0.64159047] mean value: 0.6524396309639473 key: test_accuracy value: [0.81578947 0.76315789 0.78947368 0.78947368 0.78947368 0.68421053 0.92105263 0.81578947 0.86486486 0.81081081] mean value: 0.8044096728307255 key: train_accuracy value: [0.83529412 0.82352941 0.83823529 0.81176471 0.82352941 0.81764706 0.81176471 0.83235294 0.81818182 0.81818182] mean value: 0.8230481283422459 key: test_fscore value: [0.81081081 0.72727273 0.76470588 0.76470588 0.77777778 0.64705882 0.91428571 0.81081081 0.85714286 0.81081081] mean value: 0.7885382097146804 key: train_fscore value: [0.82389937 0.8125 0.82758621 0.80124224 0.81132075 0.79605263 0.79746835 0.82018927 0.80503145 0.80503145] mean value: 0.8100321722246597 key: test_precision value: [0.83333333 0.85714286 0.86666667 0.86666667 0.82352941 0.73333333 1. 
0.83333333 0.88235294 0.83333333] mean value: 0.85296918767507 key: train_precision value: [0.88513514 0.86666667 0.88590604 0.84868421 0.87162162 0.90298507 0.8630137 0.88435374 0.8707483 0.86486486] mean value: 0.874397935315639 key: test_recall value: [0.78947368 0.63157895 0.68421053 0.68421053 0.73684211 0.57894737 0.84210526 0.78947368 0.83333333 0.78947368] mean value: 0.7359649122807017 key: train_recall value: [0.77058824 0.76470588 0.77647059 0.75882353 0.75882353 0.71176471 0.74117647 0.76470588 0.74853801 0.75294118] mean value: 0.7548538011695907 key: test_roc_auc value: [0.81578947 0.76315789 0.78947368 0.78947368 0.78947368 0.68421053 0.92105263 0.81578947 0.86403509 0.81140351] mean value: 0.8043859649122806 key: train_roc_auc value: [0.83529412 0.82352941 0.83823529 0.81176471 0.82352941 0.81764706 0.81176471 0.83235294 0.81838665 0.81799106] mean value: 0.8230495356037152 key: test_jcc value: [0.68181818 0.57142857 0.61904762 0.61904762 0.63636364 0.47826087 0.84210526 0.68181818 0.75 0.68181818] mean value: 0.6561708124065103 key: train_jcc value: [0.70053476 0.68421053 0.70588235 0.66839378 0.68253968 0.66120219 0.66315789 0.69518717 0.67368421 0.67368421] mean value: 0.6808476770895582 MCC on Blind test: 0.68 Accuracy on Blind test: 0.84 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', 
min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01116753 0.01096296 0.01096344 0.01117587 0.01004553 0.01000977 0.01028728 0.01055145 0.0107522 0.01096797] mean value: 0.010688400268554688 key: score_time value: [0.0098331 0.00980783 0.00969982 0.01029825 0.00925255 0.00957799 0.00884676 0.008991 0.009202 0.00951934] mean value: 0.009502863883972168 key: test_mcc value: [0.52704628 0.47633051 0.63245553 0.63245553 0.73786479 0.52704628 0.9486833 0.58218174 0.89736456 0.35558302] mean value: 0.6317011536883512 key: train_mcc value: [0.74138173 0.74717517 0.73632672 0.72354193 0.71236887 0.72354193 0.71944168 0.72961376 0.72462581 0.75953765] mean value: 0.7317555250285934 key: test_accuracy value: [0.76315789 0.73684211 0.81578947 0.81578947 0.86842105 0.76315789 0.97368421 0.78947368 0.94594595 0.67567568] mean value: 0.8147937411095306 key: train_accuracy value: [0.87058824 0.87352941 0.86764706 0.86176471 0.85588235 0.86176471 0.85882353 0.86470588 0.86217009 0.8797654 ] mean value: 0.865664136622391 key: test_fscore value: [0.75675676 0.75 0.81081081 0.82051282 0.86486486 0.75675676 0.97297297 0.8 0.94736842 0.71428571] mean value: 0.8194329118013328 key: train_fscore value: [0.87209302 0.87463557 0.87106017 0.86217009 0.85878963 0.86217009 0.86363636 0.86627907 0.86455331 0.87905605] mean value: 0.8674443359724497 key: test_precision value: [0.77777778 0.71428571 0.83333333 0.8 0.88888889 0.77777778 1. 
0.76190476 0.9 0.65217391] mean value: 0.8106142167011732 key: train_precision value: [0.86206897 0.86705202 0.84916201 0.85964912 0.84180791 0.85964912 0.83516484 0.85632184 0.85227273 0.8816568 ] mean value: 0.8564805361282117 key: test_recall value: [0.73684211 0.78947368 0.78947368 0.84210526 0.84210526 0.73684211 0.94736842 0.84210526 1. 0.78947368] mean value: 0.831578947368421 key: train_recall value: [0.88235294 0.88235294 0.89411765 0.86470588 0.87647059 0.86470588 0.89411765 0.87647059 0.87719298 0.87647059] mean value: 0.8788957688338493 key: test_roc_auc value: [0.76315789 0.73684211 0.81578947 0.81578947 0.86842105 0.76315789 0.97368421 0.78947368 0.94736842 0.67251462] mean value: 0.8146198830409357 key: train_roc_auc value: [0.87058824 0.87352941 0.86764706 0.86176471 0.85588235 0.86176471 0.85882353 0.86470588 0.8621259 0.87975576] mean value: 0.8656587547299621 key: test_jcc value: [0.60869565 0.6 0.68181818 0.69565217 0.76190476 0.60869565 0.94736842 0.66666667 0.9 0.55555556] mean value: 0.7026357065258667 key: train_jcc value: [0.77319588 0.77720207 0.7715736 0.75773196 0.75252525 0.75773196 0.76 0.76410256 0.76142132 0.78421053] mean value: 0.7659695133154767 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, 
n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01076102 0.01020241 0.01022601 0.01018453 0.01024795 0.01016283 0.01031971 0.0109694 0.01042318 0.01042962] mean value: 0.01039266586303711 key: score_time value: [0.01769781 0.0165956 0.01196361 0.01205826 0.01194811 0.01616716 0.01586628 0.01836443 0.01750278 0.01775217] mean value: 0.015591621398925781 key: test_mcc value: [0.26462806 0.05383819 0.58218174 0.53300179 0.42163702 0.53300179 0.59222009 0.47368421 0.62280702 0.40469382] mean value: 0.44816937343834784 key: train_mcc value: [0.65385813 0.68277833 0.68235294 0.64127633 0.65322377 0.65304287 0.62968418 0.67100629 0.66019502 0.69541138] mean value: 0.6622829245583861 key: test_accuracy value: [0.63157895 0.52631579 0.78947368 0.76315789 0.71052632 0.76315789 0.78947368 0.73684211 0.81081081 0.7027027 ] mean value: 0.7224039829302987 key: train_accuracy value: [0.82647059 0.84117647 0.84117647 0.82058824 0.82647059 0.82647059 0.81470588 0.83529412 0.82991202 0.84750733] mean value: 0.830977229601518 key: test_fscore value: [0.65 0.47058824 0.77777778 0.74285714 0.71794872 0.74285714 0.76470588 0.73684211 0.81081081 0.71794872] mean value: 0.7132336533110527 key: train_fscore value: [0.82175227 0.84393064 0.84117647 0.8189911 0.82898551 0.82492582 0.8173913 0.83233533 0.83333333 0.84431138] mean value: 0.8307133137748363 key: test_precision value: [0.61904762 0.53333333 0.82352941 0.8125 0.7 0.8125 0.86666667 0.73684211 0.78947368 0.7 ] mean value: 0.7393892820286009 key: train_precision value: [0.8447205 0.82954545 0.84117647 0.82634731 0.81714286 0.83233533 0.80571429 0.84756098 0.81920904 0.8597561 
] mean value: 0.8323508312334535 key: test_recall value: [0.68421053 0.42105263 0.73684211 0.68421053 0.73684211 0.68421053 0.68421053 0.73684211 0.83333333 0.73684211] mean value: 0.6938596491228071 key: train_recall value: [0.8 0.85882353 0.84117647 0.81176471 0.84117647 0.81764706 0.82941176 0.81764706 0.84795322 0.82941176] mean value: 0.8295012039903681 key: test_roc_auc value: [0.63157895 0.52631579 0.78947368 0.76315789 0.71052632 0.76315789 0.78947368 0.73684211 0.81140351 0.70175439] mean value: 0.7223684210526315 key: train_roc_auc value: [0.82647059 0.84117647 0.84117647 0.82058824 0.82647059 0.82647059 0.81470588 0.83529412 0.82985896 0.84745442] mean value: 0.8309666322669419 key: test_jcc value: [0.48148148 0.30769231 0.63636364 0.59090909 0.56 0.59090909 0.61904762 0.58333333 0.68181818 0.56 ] mean value: 0.5611554741554742 key: train_jcc value: [0.6974359 0.73 0.72588832 0.69346734 0.70792079 0.7020202 0.69117647 0.71282051 0.71428571 0.73056995] mean value: 0.7105585198972811 MCC on Blind test: 0.5 Accuracy on Blind test: 0.75 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01894116 0.01715732 0.01883602 0.01850367 0.01840329 0.01838541 0.01891804 0.01576376 0.01587749 0.01550865] mean value: 0.017629480361938475 key: score_time value: [0.01178336 0.01053977 0.01157427 0.01156521 0.01155615 0.01190138 0.01167703 0.01049495 0.01038718 0.01023459] mean value: 0.011171388626098632 key: test_mcc value: [0.78947368 0.79388419 0.68421053 0.84327404 0.79388419 0.68421053 0.9486833 0.84327404 0.89736456 0.40469382] mean value: 0.768295287708455 key: train_mcc value: [0.78828985 0.79413139 0.8 0.78828985 0.80600787 0.78828985 0.79424133 0.81227078 0.78299907 0.82992191] mean value: 0.798444188462922 key: test_accuracy value: [0.89473684 0.89473684 0.84210526 0.92105263 0.89473684 0.84210526 0.97368421 0.92105263 0.94594595 0.7027027 ] mean value: 0.8832859174964438 key: train_accuracy value: [0.89411765 0.89705882 0.9 0.89411765 0.90294118 0.89411765 0.89705882 0.90588235 0.8914956 0.91495601] mean value: 0.8991745730550285 key: test_fscore value: [0.89473684 0.9 0.84210526 0.91891892 0.88888889 0.84210526 0.97297297 0.91891892 0.94736842 0.71794872] mean value: 0.8843964207122101 key: train_fscore value: [0.89473684 0.8973607 0.9 0.89473684 0.90379009 0.89349112 0.89795918 0.90751445 0.89212828 0.91445428] mean value: 0.8996171791456794 key: test_precision value: [0.89473684 0.85714286 0.84210526 0.94444444 0.94117647 0.84210526 1. 
0.94444444 0.9 0.7 ] mean value: 0.8866155585041033 key: train_precision value: [0.88953488 0.89473684 0.9 0.88953488 0.89595376 0.89880952 0.89017341 0.89204545 0.88953488 0.91715976] mean value: 0.8957483402566699 key: test_recall value: [0.89473684 0.94736842 0.84210526 0.89473684 0.84210526 0.84210526 0.94736842 0.89473684 1. 0.73684211] mean value: 0.8842105263157894 key: train_recall value: [0.9 0.9 0.9 0.9 0.91176471 0.88823529 0.90588235 0.92352941 0.89473684 0.91176471] mean value: 0.9035913312693499 key: test_roc_auc value: [0.89473684 0.89473684 0.84210526 0.92105263 0.89473684 0.84210526 0.97368421 0.92105263 0.94736842 0.70175439] mean value: 0.8833333333333333 key: train_roc_auc value: [0.89411765 0.89705882 0.9 0.89411765 0.90294118 0.89411765 0.89705882 0.90588235 0.89148607 0.91494668] mean value: 0.899172686618507 key: test_jcc value: [0.80952381 0.81818182 0.72727273 0.85 0.8 0.72727273 0.94736842 0.85 0.9 0.56 ] mean value: 0.7989619503303714 key: train_jcc value: [0.80952381 0.81382979 0.81818182 0.80952381 0.82446809 0.80748663 0.81481481 0.83068783 0.80526316 0.8423913 ] mean value: 0.8176171048331113 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.88404822 2.01980352 1.86825657 1.99386954 1.89242554 2.02484608 1.87313199 1.85225391 2.03934073 1.26169872] mean value: 1.870967483520508 key: score_time value: [0.05082941 0.02140141 0.01251268 0.01486778 0.02660775 0.01820397 0.01507092 0.01291871 0.01262975 0.01267314] mean value: 0.019771552085876463 key: test_mcc value: [0.78947368 0.69989647 0.73786479 0.9486833 0.9486833 0.78947368 0.89973541 0.79388419 0.89736456 0.56725146] mean value: 0.8072310845862681 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.89473684 0.84210526 0.86842105 0.97368421 0.97368421 0.89473684 0.94736842 0.89473684 0.94594595 0.78378378] mean value: 0.9019203413940255 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.89473684 0.85714286 0.87179487 0.97435897 0.97435897 0.89473684 0.94444444 0.88888889 0.94736842 0.78947368] mean value: 0.9037304800462695 key: train_fscore value: [1. 1. 1. 
1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.89473684 0.7826087 0.85 0.95 0.95 0.89473684 1. 0.94117647 0.9 0.78947368] mean value: 0.8952732534661462 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.89473684 0.94736842 0.89473684 1. 1. 0.89473684 0.89473684 0.84210526 1. 0.78947368] mean value: 0.9157894736842105 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.89473684 0.84210526 0.86842105 0.97368421 0.97368421 0.89473684 0.94736842 0.89473684 0.94736842 0.78362573] mean value: 0.902046783625731 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.80952381 0.75 0.77272727 0.95 0.95 0.80952381 0.89473684 0.8 0.9 0.65217391] mean value: 0.8288685646923634 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, 
importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02012491 0.01943636 0.01589656 0.01583838 0.01620412 0.01516843 0.01706886 0.01599669 0.01592517 0.01600361] mean value: 0.01676630973815918 key: score_time value: [0.01256251 0.01115823 0.00905514 0.00882006 0.00896192 0.00889635 0.00920773 0.00900817 0.0087254 0.00878215] mean value: 0.009517765045166016 key: test_mcc value: [0.89973541 0.89973541 0.9486833 0.84327404 0.89473684 0.89473684 1. 0.84327404 0.94736842 0.83871328] mean value: 0.9010257594256045 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.94736842 0.97368421 0.92105263 0.94736842 0.94736842 1. 0.92105263 0.97297297 0.91891892] mean value: 0.9497155049786629 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94444444 0.95 0.97435897 0.92307692 0.94736842 0.94736842 1. 0.91891892 0.97297297 0.92307692] mean value: 0.950158599895442 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9047619 0.95 0.9 0.94736842 0.94736842 1. 0.94444444 0.94736842 0.9 ] mean value: 0.9441311612364244 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.89473684 1. 1. 0.94736842 0.94736842 0.94736842 1. 0.89473684 1. 0.94736842] mean value: 0.9578947368421052 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.94736842 0.97368421 0.92105263 0.94736842 0.94736842 1. 
0.92105263 0.97368421 0.91812865] mean value: 0.9497076023391813 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.89473684 0.9047619 0.95 0.85714286 0.9 0.9 1. 0.85 0.94736842 0.85714286] mean value: 0.9061152882205513 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic 
GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.10629559 0.10595846 0.10605121 0.10595107 0.10800147 0.10813618 0.11086369 0.11145663 0.11341166 0.11815166] mean value: 0.10942776203155517 key: score_time value: [0.01755023 0.01762581 0.01783776 0.01773572 0.01776767 0.01922417 0.01921582 0.01775551 0.01930189 0.0180676 ] mean value: 0.01820821762084961 key: test_mcc value: [0.78947368 0.63245553 0.73786479 0.89973541 0.9486833 0.73786479 0.89973541 0.74620251 0.84959079 0.62807634] mean value: 0.7869682546638045 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.89473684 0.81578947 0.86842105 0.94736842 0.97368421 0.86842105 0.94736842 0.86842105 0.91891892 0.81081081] mean value: 0.8913940256045519 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_fscore value: [0.89473684 0.81081081 0.87179487 0.95 0.97435897 0.87179487 0.94444444 0.85714286 0.92307692 0.82926829] mean value: 0.8927428888211943 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.89473684 0.83333333 0.85 0.9047619 0.95 0.85 1. 0.9375 0.85714286 0.77272727] mean value: 0.8850202210070631 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.89473684 0.78947368 0.89473684 1. 1. 0.89473684 0.89473684 0.78947368 1. 0.89473684] mean value: 0.9052631578947369 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.89473684 0.81578947 0.86842105 0.94736842 0.97368421 0.86842105 0.94736842 0.86842105 0.92105263 0.80847953] mean value: 0.8913742690058479 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.80952381 0.68181818 0.77272727 0.9047619 0.95 0.77272727 0.89473684 0.75 0.85714286 0.70833333] mean value: 0.8101771474139895 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.78 Accuracy on Blind test: 0.89 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00998735 0.01076865 0.01041269 0.01028037 0.00994205 0.01035309 0.01037288 0.01094675 0.01107693 0.01506305] mean value: 0.010920381546020508 key: score_time value: [0.00932503 0.00965333 0.00984311 0.00884271 0.00942039 0.00911498 0.01005459 0.00995064 0.00970674 0.011832 ] mean value: 0.009774351119995117 key: test_mcc value: [0.47633051 0.52704628 0.73786479 0.68421053 0.42163702 0.36842105 0.52704628 0.68421053 0.40469382 0.46019501] mean value: 0.529165581278633 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.73684211 0.76315789 0.86842105 0.84210526 0.71052632 0.68421053 0.76315789 0.84210526 0.7027027 0.72972973] mean value: 0.7642958748221906 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.75 0.75675676 0.87179487 0.84210526 0.7027027 0.68421053 0.76923077 0.84210526 0.68571429 0.75 ] mean value: 0.7654620438830966 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.71428571 0.77777778 0.85 0.84210526 0.72222222 0.68421053 0.75 0.84210526 0.70588235 0.71428571] mean value: 0.7602874834144184 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.78947368 0.73684211 0.89473684 0.84210526 0.68421053 0.68421053 0.78947368 0.84210526 0.66666667 0.78947368] mean value: 0.7719298245614035 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.73684211 0.76315789 0.86842105 0.84210526 0.71052632 0.68421053 0.76315789 0.84210526 0.70175439 0.72807018] mean value: 0.7640350877192983 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.6 0.60869565 0.77272727 0.72727273 0.54166667 0.52 0.625 0.72727273 0.52173913 0.6 ] mean value: 0.624437417654809 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.52 Accuracy on Blind test: 0.76 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn(
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.60144925 1.58419991 1.56108975 1.55858779 1.55152655 1.62324905 1.61374426 1.59573984 1.60749698 1.50386572] mean value: 1.5800949096679688 key: score_time value: [0.09293389 0.09812546 0.09960723 0.09873724 0.09930444 0.09915233 0.10042262 0.09998393 0.10088515 0.09459758] mean value: 0.09837498664855956 key: test_mcc value: [0.89473684 0.79388419 0.9486833 0.9486833 0.89973541 0.89473684 1. 0.84327404 0.94736842 0.78362573] mean value: 0.8954728071949787 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.89473684 0.97368421 0.97368421 0.94736842 0.94736842 1. 0.92105263 0.97297297 0.89189189] mean value: 0.9470128022759602 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 0.88888889 0.97435897 0.97435897 0.95 0.94736842 1. 0.91891892 0.97297297 0.89473684] mean value: 0.9468972413709256 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.94736842 0.94117647 0.95 0.95 0.9047619 0.94736842 1. 0.94444444 0.94736842 0.89473684] mean value: 0.9427224925057742 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.94736842 0.84210526 1. 1. 1. 0.94736842 1. 0.89473684 1. 0.89473684] mean value: 0.9526315789473684 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.94736842 0.89473684 0.97368421 0.97368421 0.94736842 0.94736842 1. 
0.92105263 0.97368421 0.89181287] mean value: 0.9470760233918128 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.8 0.95 0.95 0.9047619 0.9 1. 0.85 0.94736842 0.80952381] mean value: 0.9011654135338346 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive 
Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.94246864 0.95814466 0.88224459 0.89182115 0.94442129 1.0197804 0.91957068 0.97436619 0.95079732 0.92337275] mean value: 0.940698766708374 key: score_time value: [0.19855642 0.15321207 0.25342155 0.13341498 0.17170119 0.26337838 0.17836618 0.23186874 0.27052569 0.24460649] mean value: 0.20990517139434814 key: test_mcc value: [0.89473684 0.79388419 0.9486833 0.9486833 0.89473684 0.84327404 0.9486833 0.84327404 1. 
0.73020842] mean value: 0.8846164268265851 key: train_mcc value: [0.96470588 0.95884012 0.96477265 0.95884012 0.95300713 0.96470588 0.95884012 0.95294118 0.95314596 0.97069143] mean value: 0.9600490476456546 key: test_accuracy value: [0.94736842 0.89473684 0.97368421 0.97368421 0.94736842 0.92105263 0.97368421 0.92105263 1. 0.86486486] mean value: 0.9417496443812233 key: train_accuracy value: [0.98235294 0.97941176 0.98235294 0.97941176 0.97647059 0.98235294 0.97941176 0.97647059 0.97653959 0.98533724] mean value: 0.9800112126962222 key: test_fscore value: [0.94736842 0.88888889 0.97435897 0.97435897 0.94736842 0.92307692 0.97297297 0.91891892 1. 0.87179487] mean value: 0.9419107366475787 key: train_fscore value: [0.98235294 0.97935103 0.98224852 0.97935103 0.97633136 0.98235294 0.97935103 0.97647059 0.97647059 0.98533724] mean value: 0.9799617281227226 key: test_precision value: [0.94736842 0.94117647 0.95 0.95 0.94736842 0.9 1. 0.94444444 1. 0.85 ] mean value: 0.9430357757137943 key: train_precision value: [0.98235294 0.98224852 0.98809524 0.98224852 0.98214286 0.98235294 0.98224852 0.97647059 0.98224852 0.98245614] mean value: 0.9822864789017445 key: test_recall value: [0.94736842 0.84210526 1. 1. 0.94736842 0.94736842 0.94736842 0.89473684 1. 0.89473684] mean value: 0.9421052631578947 key: train_recall value: [0.98235294 0.97647059 0.97647059 0.97647059 0.97058824 0.98235294 0.97647059 0.97647059 0.97076023 0.98823529] mean value: 0.9776642586859305 key: test_roc_auc value: [0.94736842 0.89473684 0.97368421 0.97368421 0.94736842 0.92105263 0.97368421 0.92105263 1. 
0.86403509] mean value: 0.9416666666666667 key: train_roc_auc value: [0.98235294 0.97941176 0.98235294 0.97941176 0.97647059 0.98235294 0.97941176 0.97647059 0.97655659 0.98534572] mean value: 0.9800137598899209 key: test_jcc value: [0.9 0.8 0.95 0.95 0.9 0.85714286 0.94736842 0.85 1. 0.77272727] mean value: 0.8927238550922761 key: train_jcc value: [0.96531792 0.95953757 0.96511628 0.95953757 0.95375723 0.96531792 0.95953757 0.95402299 0.95402299 0.97109827] mean value: 0.9607266302324036 MCC on Blind test: 0.85 Accuracy on Blind test: 0.92 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01481676 0.01259422 0.01158786 0.01182032 0.0115788 0.01470709 0.01181889 0.01124763 0.01149035 0.01684523] mean value: 0.012850713729858399 key: score_time value: [0.01577783 0.01042986 0.01050568 0.01003408 0.01580787 0.01024508 0.01042175 0.01017332 0.01543117 0.01240611] mean value: 0.01212327480316162 key: test_mcc value: [0.52704628 0.47633051 0.63245553 0.63245553 0.73786479 0.52704628 0.9486833 0.58218174 0.89736456 0.35558302] mean value: 0.6317011536883512 key: train_mcc value: [0.74138173 0.74717517 0.73632672 0.72354193 0.71236887 0.72354193 0.71944168 0.72961376 0.72462581 0.75953765] mean value: 
0.7317555250285934 key: test_accuracy value: [0.76315789 0.73684211 0.81578947 0.81578947 0.86842105 0.76315789 0.97368421 0.78947368 0.94594595 0.67567568] mean value: 0.8147937411095306 key: train_accuracy value: [0.87058824 0.87352941 0.86764706 0.86176471 0.85588235 0.86176471 0.85882353 0.86470588 0.86217009 0.8797654 ] mean value: 0.865664136622391 key: test_fscore value: [0.75675676 0.75 0.81081081 0.82051282 0.86486486 0.75675676 0.97297297 0.8 0.94736842 0.71428571] mean value: 0.8194329118013328 key: train_fscore value: [0.87209302 0.87463557 0.87106017 0.86217009 0.85878963 0.86217009 0.86363636 0.86627907 0.86455331 0.87905605] mean value: 0.8674443359724497 key: test_precision value: [0.77777778 0.71428571 0.83333333 0.8 0.88888889 0.77777778 1. 0.76190476 0.9 0.65217391] mean value: 0.8106142167011732 key: train_precision value: [0.86206897 0.86705202 0.84916201 0.85964912 0.84180791 0.85964912 0.83516484 0.85632184 0.85227273 0.8816568 ] mean value: 0.8564805361282117 key: test_recall value: [0.73684211 0.78947368 0.78947368 0.84210526 0.84210526 0.73684211 0.94736842 0.84210526 1. 
0.78947368] mean value: 0.831578947368421 key: train_recall value: [0.88235294 0.88235294 0.89411765 0.86470588 0.87647059 0.86470588 0.89411765 0.87647059 0.87719298 0.87647059] mean value: 0.8788957688338493 key: test_roc_auc value: [0.76315789 0.73684211 0.81578947 0.81578947 0.86842105 0.76315789 0.97368421 0.78947368 0.94736842 0.67251462] mean value: 0.8146198830409357 key: train_roc_auc value: [0.87058824 0.87352941 0.86764706 0.86176471 0.85588235 0.86176471 0.85882353 0.86470588 0.8621259 0.87975576] mean value: 0.8656587547299621 key: test_jcc value: [0.60869565 0.6 0.68181818 0.69565217 0.76190476 0.60869565 0.94736842 0.66666667 0.9 0.55555556] mean value: 0.7026357065258667 key: train_jcc value: [0.77319588 0.77720207 0.7715736 0.75773196 0.75252525 0.75773196 0.76 0.76410256 0.76142132 0.78421053] mean value: 0.7659695133154767 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', 
ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... 
interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.50253129 0.13468575 0.05374956 0.13997626 0.35016894 0.12072372 0.05626488 0.16596508 0.05752921 0.05376387] mean value: 0.16353585720062255 key: score_time value: [0.01153541 0.01105118 0.01070404 0.01191998 0.01194215 0.01204967 0.01075387 0.01131034 0.01075554 0.01144719] mean value: 0.011346936225891113 key: test_mcc value: [0.9486833 0.89973541 0.89473684 0.89973541 1. 0.9486833 0.9486833 0.9486833 1. 0.78362573] mean value: 0.9272566586986345 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.97368421 0.97368421 1. 0.89189189] mean value: 0.962873399715505 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97297297 0.95 0.94736842 0.95 1. 0.97297297 0.97297297 0.97435897 1. 0.89473684] mean value: 0.9635383156435788 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9047619 0.94736842 0.9047619 1. 1. 1. 0.95 1. 0.89473684] mean value: 0.9601629072681704 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.94736842 1. 0.94736842 1. 1. 0.94736842 0.94736842 1. 1. 0.89473684] mean value: 0.968421052631579 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.97368421 0.97368421 1. 0.89181287] mean value: 0.9628654970760233 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.94736842 0.9047619 0.9 0.9047619 1. 0.94736842 0.94736842 0.95 1. 0.80952381] mean value: 0.9311152882205513 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, 
oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.05968142 0.05620432 0.04024172 0.05535245 0.06633306 0.06649137 0.05653095 0.0731082 0.0600152 0.07051778] mean value: 0.06044764518737793 key: score_time value: [0.02383661 0.01224399 0.01207924 0.01229954 0.02160764 0.03937793 0.02177739 0.01230597 0.02152944 0.01230001] mean value: 0.018935775756835936 key: test_mcc value: [0.78947368 0.89473684 0.58218174 0.79388419 0.79388419 0.63245553 0.89973541 0.73786479 0.78764146 0.62280702] mean value: 0.753466484462356 key: train_mcc value: [0.94720632 0.93543979 0.92941176 0.95294118 0.94143711 0.95897286 0.94707521 0.94117647 0.94723082 0.93550052] mean value: 0.9436392042730108 key: test_accuracy value: [0.89473684 0.94736842 0.78947368 0.89473684 0.89473684 0.81578947 0.94736842 0.86842105 0.89189189 0.81081081] mean value: 0.8755334281650071 key: train_accuracy value: [0.97352941 0.96764706 0.96470588 0.97647059 0.97058824 0.97941176 0.97352941 0.97058824 0.97360704 0.96774194] 
mean value: 0.9717819561842332 key: test_fscore value: [0.89473684 0.94736842 0.77777778 0.9 0.9 0.82051282 0.94444444 0.87179487 0.88235294 0.81081081] mean value: 0.8749798929675091 key: train_fscore value: [0.97329377 0.96735905 0.96470588 0.97647059 0.9702381 0.97922849 0.97345133 0.97058824 0.97360704 0.96774194] mean value: 0.9716684407799097 key: test_precision value: [0.89473684 0.94736842 0.82352941 0.85714286 0.85714286 0.8 1. 0.85 0.9375 0.83333333] mean value: 0.8800753722541648 key: train_precision value: [0.98203593 0.9760479 0.96470588 0.97647059 0.98192771 0.98802395 0.97633136 0.97058824 0.97647059 0.96491228] mean value: 0.9757514431040658 key: test_recall value: [0.89473684 0.94736842 0.73684211 0.94736842 0.94736842 0.84210526 0.89473684 0.89473684 0.83333333 0.78947368] mean value: 0.8728070175438596 key: train_recall value: [0.96470588 0.95882353 0.96470588 0.97647059 0.95882353 0.97058824 0.97058824 0.97058824 0.97076023 0.97058824] mean value: 0.9676642586859305 key: test_roc_auc value: [0.89473684 0.94736842 0.78947368 0.89473684 0.89473684 0.81578947 0.94736842 0.86842105 0.89035088 0.81140351] mean value: 0.875438596491228 key: train_roc_auc value: [0.97352941 0.96764706 0.96470588 0.97647059 0.97058824 0.97941176 0.97352941 0.97058824 0.97361541 0.96775026] mean value: 0.9717836257309942 key: test_jcc value: [0.80952381 0.9 0.63636364 0.81818182 0.81818182 0.69565217 0.89473684 0.77272727 0.78947368 0.68181818] mean value: 0.781665923702537 key: train_jcc value: [0.94797688 0.93678161 0.93181818 0.95402299 0.94219653 0.95930233 0.94827586 0.94285714 0.94857143 0.9375 ] mean value: 0.9449302949002888 MCC on Blind test: 0.72 Accuracy on Blind test: 0.86 Model_name: Multinomial Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg',
'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.03640914 0.01111817 0.0107789 0.0103848 0.01065993 0.01029205 0.0105691 0.01045561 0.00946641 0.00948262] mean value: 0.012961673736572265 key: score_time value: [0.01239967 0.00975752 0.0088954 0.00995421 0.00950813 0.01114082 0.00941396 0.00899124 0.00900888 0.00877619] mean value: 0.009784603118896484 key: test_mcc value: [0.79388419 0.59222009 0.78947368 0.68421053 0.69989647 0.37047929 0.89473684 0.52704628 0.89736456 0.29618896] mean value: 0.6545500886423572 key: train_mcc value: [0.72354193 0.70607783 0.73540864 0.64777648 0.62994603 0.72414357 0.67676337 0.67100629 0.66600142 0.70748143] mean value: 0.6888146983674961 key: test_accuracy value: [0.89473684 0.78947368 0.89473684 0.84210526 0.84210526 0.68421053 0.94736842 0.76315789 0.94594595 0.64864865] mean value: 0.82524893314367 key: train_accuracy value: [0.86176471 0.85294118 0.86764706 0.82352941 0.81470588 0.86176471 0.83823529 0.83529412 0.83284457 0.85337243] mean value: 0.8442099361738831 key: test_fscore value: [0.88888889 0.76470588 0.89473684 0.84210526 0.82352941 0.66666667 0.94736842 0.76923077 0.94736842 0.66666667] mean value: 0.821126723293906 key: train_fscore value: [0.86135693 0.85119048 0.86646884 0.81927711 0.81081081 0.85885886 0.8358209 0.83233533 0.83086053 0.84939759] mean value: 0.8416377378527024 key: test_precision value: [0.94117647 0.86666667 0.89473684 0.84210526 0.93333333 0.70588235 0.94736842 0.75 0.9 0.65 ] mean value: 0.8431269349845201 key: train_precision value: [0.86390533 0.86144578 
0.8742515 0.83950617 0.82822086 0.87730061 0.84848485 0.84756098 0.84337349 0.87037037] mean value: 0.8554419939255328 key: test_recall value: [0.84210526 0.68421053 0.89473684 0.84210526 0.73684211 0.63157895 0.94736842 0.78947368 1. 0.68421053] mean value: 0.8052631578947368 key: train_recall value: [0.85882353 0.84117647 0.85882353 0.8 0.79411765 0.84117647 0.82352941 0.81764706 0.81871345 0.82941176] mean value: 0.8283419332645339 key: test_roc_auc value: [0.89473684 0.78947368 0.89473684 0.84210526 0.84210526 0.68421053 0.94736842 0.76315789 0.94736842 0.64766082] mean value: 0.8252923976608187 key: train_roc_auc value: [0.86176471 0.85294118 0.86764706 0.82352941 0.81470588 0.86176471 0.83823529 0.83529412 0.83288614 0.85330237] mean value: 0.8442070863433093 key: test_jcc value: [0.8 0.61904762 0.80952381 0.72727273 0.7 0.5 0.9 0.625 0.9 0.5 ] mean value: 0.7080844155844156 key: train_jcc value: [0.75647668 0.74093264 0.76439791 0.69387755 0.68181818 0.75263158 0.71794872 0.71282051 0.7106599 0.7382199 ] mean value: 0.7269783568504338 MCC on Blind test: 0.77 Accuracy on Blind test: 0.89 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01160216 0.01595926 0.01850629 0.01905012 0.01691532 0.01988149 0.02177978 0.01703143 0.01949835 0.01658297] mean value: 0.017680716514587403 key: score_time value: [0.0089817 0.01127219 0.01129818 0.01176715 0.01182032 0.01188231 0.01186323 0.01187611 0.01192379 0.01185489] mean value: 0.011453986167907715 key: test_mcc value: [0.61017022 0.84327404 0.68803296 0.80757285 0.80757285 0.74620251 0.9486833 0.54554473 0.89736456 0.57857577] mean value: 0.7472993786374361 key: train_mcc value: [0.78047467 0.86724532 0.88861973 0.87273022 0.81456113 0.81962489 0.84766884 0.72101498 0.90106396 0.85672926] mean value: 0.8369732992272605 key: test_accuracy value: [0.78947368 0.92105263 0.84210526 0.89473684 0.89473684 0.86842105 0.97368421 0.76315789 0.94594595 0.78378378] mean value: 0.8677098150782361 key: train_accuracy value: [0.88235294 0.93235294 0.94411765 0.93235294 0.9 0.90294118 0.92058824 0.84705882 0.95014663 0.92668622] mean value: 0.9138597550457133 key: test_fscore value: [0.81818182 0.92307692 0.83333333 0.88235294 0.88235294 0.87804878 0.97435897 0.79069767 0.94736842 0.80952381] mean value: 0.8739295616786841 key: train_fscore value: [0.89304813 0.92966361 0.94328358 0.92744479 0.88961039 0.91105121 0.92520776 0.86528497 0.94925373 0.92957746] mean value: 0.9163425642953533 key: test_precision value: [0.72 0.9 0.88235294 1. 1. 0.81818182 0.95 0.70833333 0.9 0.73913043] mean value: 0.8617998527474231 key: train_precision value: [0.81862745 0.96815287 0.95757576 1. 
0.99275362 0.84079602 0.87434555 0.77314815 0.9695122 0.89189189] mean value: 0.9086803502787302 key: test_recall value: [0.94736842 0.94736842 0.78947368 0.78947368 0.78947368 0.94736842 1. 0.89473684 1. 0.89473684] mean value: 0.9 key: train_recall value: [0.98235294 0.89411765 0.92941176 0.86470588 0.80588235 0.99411765 0.98235294 0.98235294 0.92982456 0.97058824] mean value: 0.9335706914344686 key: test_roc_auc value: [0.78947368 0.92105263 0.84210526 0.89473684 0.89473684 0.86842105 0.97368421 0.76315789 0.94736842 0.78070175] mean value: 0.8675438596491228 key: train_roc_auc value: [0.88235294 0.93235294 0.94411765 0.93235294 0.9 0.90294118 0.92058824 0.84705882 0.9502064 0.92681459] mean value: 0.9138785689714483 key: test_jcc value: [0.69230769 0.85714286 0.71428571 0.78947368 0.78947368 0.7826087 0.95 0.65384615 0.9 0.68 ] mean value: 0.7809138481655644 key: train_jcc value: [0.80676329 0.86857143 0.89265537 0.86470588 0.80116959 0.83663366 0.86082474 0.76255708 0.90340909 0.86842105] mean value: 0.8465711180624056 MCC on Blind test: 0.75 Accuracy on Blind test: 0.88 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01726389 0.01823545 0.01617908 0.01863098 0.01907182 0.01757312 0.01636481 0.01706433 0.01851106 0.01773095] mean value: 0.017662549018859865 key: score_time value: [0.01184845 0.01182127 0.01185322 0.01189089 0.01183534 0.01188326 0.01175451 0.01184678 0.01183844 0.01181006] mean value: 0.011838221549987793 key: test_mcc value: [0.74620251 0.63245553 0.68803296 0.89973541 0.85280287 0.68803296 0.76376262 0.79388419 0.7163504 0.57857577] mean value: 0.7359835206381676 key: train_mcc value: [0.88333157 0.90352405 0.86508013 0.90014017 0.82470774 0.8452381 0.72433672 0.83493231 0.77515848 0.87706192] mean value: 0.8433511179735004 key: test_accuracy value: [0.86842105 0.81578947 0.84210526 0.94736842 0.92105263 0.84210526 0.86842105 0.89473684 0.83783784 0.78378378] mean value: 0.8621621621621621 key: train_accuracy value: [0.94117647 0.95 0.93235294 0.95 0.90588235 0.91764706 0.84411765 0.91176471 0.87683284 0.93548387] mean value: 0.916525789201311 key: test_fscore value: [0.85714286 0.82051282 0.85 0.95 0.91428571 0.83333333 0.84848485 0.88888889 0.85714286 0.80952381] mean value: 0.8629315129315129 key: train_fscore value: [0.93975904 0.94769231 0.93333333 0.95043732 0.89677419 0.91082803 0.81533101 0.90384615 0.89005236 0.93888889] mean value: 0.9126942623189517 key: test_precision value: [0.9375 0.8 0.80952381 0.9047619 1. 0.88235294 1. 0.94117647 0.75 0.73913043] mean value: 0.8764445560833029 key: train_precision value: [0.96296296 0.99354839 0.92 0.94219653 0.99285714 0.99305556 1. 
0.99295775 0.8056872 0.88947368] mean value: 0.9492739214745212 key: test_recall value: [0.78947368 0.84210526 0.89473684 1. 0.84210526 0.78947368 0.73684211 0.84210526 1. 0.89473684] mean value: 0.8631578947368421 key: train_recall value: [0.91764706 0.90588235 0.94705882 0.95882353 0.81764706 0.84117647 0.68823529 0.82941176 0.99415205 0.99411765] mean value: 0.8894152046783625 key: test_roc_auc value: [0.86842105 0.81578947 0.84210526 0.94736842 0.92105263 0.84210526 0.86842105 0.89473684 0.84210526 0.78070175] mean value: 0.862280701754386 key: train_roc_auc value: [0.94117647 0.95 0.93235294 0.95 0.90588235 0.91764706 0.84411765 0.91176471 0.87648779 0.93565531] mean value: 0.9165084279325766 key: test_jcc value: [0.75 0.69565217 0.73913043 0.9047619 0.84210526 0.71428571 0.73684211 0.8 0.75 0.68 ] mean value: 0.7612777596164324 key: train_jcc value: [0.88636364 0.9005848 0.875 0.90555556 0.8128655 0.83625731 0.68823529 0.8245614 0.80188679 0.88481675] mean value: 0.8416127038264324 MCC on Blind test: 0.62 Accuracy on Blind test: 0.79 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.16265321 0.14861178 0.14626455 0.14584064 0.14930272 0.16142392 0.15244198 0.14463997 0.14828467 0.1534369 ] mean value: 0.15129003524780274 key: score_time value: [0.0151782 0.01543975 0.01532531 0.01555467 0.01592684 0.02432156 0.01536274 0.01528263 0.01656127 0.01543808] mean value: 0.016439104080200197 key: test_mcc value: [0.9486833 0.89973541 0.89473684 0.89973541 1. 0.9486833 0.9486833 0.9486833 1. 0.78764146] mean value: 0.927658231800502 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.97368421 0.97368421 1. 0.89189189] mean value: 0.962873399715505 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97297297 0.95 0.94736842 0.95 1. 0.97297297 0.97297297 0.97435897 1. 0.9 ] mean value: 0.9640646314330525 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9047619 0.94736842 0.9047619 1. 1. 1. 0.95 1. 0.85714286] mean value: 0.9564035087719298 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.94736842 1. 0.94736842 1. 1. 0.94736842 0.94736842 1. 1. 0.94736842] mean value: 0.9736842105263157 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.97368421 0.97368421 0.97368421 1. 0.89035088] mean value: 0.962719298245614 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.94736842 0.9047619 0.9 0.9047619 1. 0.94736842 0.94736842 0.95 1. 0.81818182] mean value: 0.9319810890863522 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.92 Accuracy on Blind test: 0.96 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.04295039 0.05016208 0.03978586 0.04688597 0.05769682 0.05953455 0.07448912 0.0562222 0.04632425 0.03965807] mean value: 0.05137093067169189 key: score_time value: [0.02003407 0.02414131 0.01633906 0.02169681 0.02622557 0.02317142 0.02110815 0.03839302 0.02159262 0.02609897] mean value: 0.02388010025024414 key: test_mcc value: [0.9486833 0.78947368 0.9486833 0.89973541 1. 0.89473684 0.9486833 0.9486833 1. 0.83871328] mean value: 0.9217392413194645 key: train_mcc value: [0.98823529 0.98823529 0.99413485 1. 0.98823529 0.98236994 0.97653817 0.99413485 0.99415185 0.98826969] mean value: 0.9894305224350558 key: test_accuracy value: [0.97368421 0.89473684 0.97368421 0.94736842 1. 0.94736842 0.97368421 0.97368421 1. 0.91891892] mean value: 0.9603129445234708 key: train_accuracy value: [0.99411765 0.99411765 0.99705882 1.
0.99411765 0.99117647 0.98823529 0.99705882 0.99706745 0.9941349 ] mean value: 0.9947084698982233 key: test_fscore value: [0.97297297 0.89473684 0.97435897 0.95 1. 0.94736842 0.97297297 0.97435897 1. 0.92307692] mean value: 0.9609846080898713 key: train_fscore value: [0.99411765 0.99411765 0.99706745 1. 0.99411765 0.99120235 0.98816568 0.99706745 0.99708455 0.99411765] mean value: 0.9947058060215382 key: test_precision value: [1. 0.89473684 0.95 0.9047619 1. 0.94736842 1. 0.95 1. 0.9 ] mean value: 0.9546867167919799 key: train_precision value: [0.99411765 0.99411765 0.99415205 1. 0.99411765 0.98830409 0.99404762 0.99415205 0.99418605 0.99411765] mean value: 0.9941312440929044 key: test_recall value: [0.94736842 0.89473684 1. 1. 1. 0.94736842 0.94736842 1. 1. 0.94736842] mean value: 0.968421052631579 key: train_recall value: [0.99411765 0.99411765 1. 1. 0.99411765 0.99411765 0.98235294 1. 1. 0.99411765] mean value: 0.9952941176470589 key: test_roc_auc value: [0.97368421 0.89473684 0.97368421 0.94736842 1. 0.94736842 0.97368421 0.97368421 1. 0.91812865] mean value: 0.960233918128655 key: train_roc_auc value: [0.99411765 0.99411765 0.99705882 1. 0.99411765 0.99117647 0.98823529 0.99705882 0.99705882 0.99413485] mean value: 0.9947076023391813 key: test_jcc value: [0.94736842 0.80952381 0.95 0.9047619 1. 0.9 0.94736842 0.95 1. 0.85714286] mean value: 0.9266165413533834 key: train_jcc value: [0.98830409 0.98830409 0.99415205 1. 
0.98830409 0.98255814 0.97660819 0.99415205 0.99418605 0.98830409] mean value: 0.9894872841017271 MCC on Blind test: 0.91 Accuracy on Blind test: 0.96 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.14919233 0.09951925 0.13426876 0.0898757 0.09314251 0.08426976 0.08239388 0.05807185 0.10029602 0.09456182] mean value: 0.09855918884277344 key: score_time value: [0.02806473 0.01825833 0.02253628 0.02233458 0.0233705 0.02221894 0.01384473 0.01433444 0.01381803 0.02184868] mean value: 0.020062923431396484 key: test_mcc value: [0.63245553 0.16151457 0.68803296 0.63245553 0.68421053 0.57894737 0.85280287 0.59222009 0.89181287 0.51319869] mean value: 0.6227651001498481 key: train_mcc value: [0.99413485 0.99413485 0.99413485 0.99413485 0.99413485 1. 0.99413485 1. 0.99415205 0.99415185] mean value: 0.9953112973615207 key: test_accuracy value: [0.81578947 0.57894737 0.84210526 0.81578947 0.84210526 0.78947368 0.92105263 0.78947368 0.94594595 0.75675676] mean value: 0.8097439544807966 key: train_accuracy value: [0.99705882 0.99705882 0.99705882 0.99705882 0.99705882 1. 0.99705882 1.
0.99706745 0.99706745] mean value: 0.9976487838537175 key: test_fscore value: [0.82051282 0.52941176 0.83333333 0.82051282 0.84210526 0.78947368 0.91428571 0.76470588 0.94444444 0.76923077] mean value: 0.8028016496747147 key: train_fscore value: [0.99705015 0.99705015 0.99705015 0.99705015 0.99705015 1. 0.99705015 1. 0.99706745 0.99705015] mean value: 0.9976418481128729 key: test_precision value: [0.8 0.6 0.88235294 0.8 0.84210526 0.78947368 1. 0.86666667 0.94444444 0.75 ] mean value: 0.8275042999656003 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.84210526 0.47368421 0.78947368 0.84210526 0.84210526 0.78947368 0.84210526 0.68421053 0.94444444 0.78947368] mean value: 0.7839181286549708 key: train_recall value: [0.99411765 0.99411765 0.99411765 0.99411765 0.99411765 1. 0.99411765 1. 0.99415205 0.99411765] mean value: 0.9952975576195391 key: test_roc_auc value: [0.81578947 0.57894737 0.84210526 0.81578947 0.84210526 0.78947368 0.92105263 0.78947368 0.94590643 0.75584795] mean value: 0.8096491228070175 key: train_roc_auc value: [0.99705882 0.99705882 0.99705882 0.99705882 0.99705882 1. 0.99705882 1. 0.99707602 0.99705882] mean value: 0.9976487788097695 key: test_jcc value: [0.69565217 0.36 0.71428571 0.69565217 0.72727273 0.65217391 0.84210526 0.61904762 0.89473684 0.625 ] mean value: 0.6825926426738784 key: train_jcc value: [0.99411765 0.99411765 0.99411765 0.99411765 0.99411765 1. 0.99411765 1. 
0.99415205 0.99411765] mean value: 0.9952975576195391 MCC on Blind test: 0.54 Accuracy on Blind test: 0.77 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.6186831 0.62528348 0.54585361 0.54748797 0.58207774 0.56963778 0.5545404 0.55546618 0.54905748 0.55526996] mean value: 0.5703357696533203 key: score_time value: [0.01032495 0.00916743 0.00931358 0.01051068 0.00964808 0.00931263 0.01022315 0.00920653 0.00935125 0.00947666] mean value: 0.009653496742248534 key: test_mcc value: [0.9486833 0.89973541 0.89473684 0.89973541 1. 0.89473684 1. 0.89473684 0.94736842 0.83871328] mean value: 0.9218446350938172 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.94736842 1. 0.94736842 0.97297297 0.91891892] mean value: 0.9602418207681366 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97297297 0.95 0.94736842 0.95 1. 0.94736842 1. 0.94736842 0.97297297 0.92307692] mean value: 0.9611128132180764 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9047619 0.94736842 0.9047619 1. 0.94736842 1. 
0.94736842 0.94736842 0.9 ] mean value: 0.9498997493734336 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.94736842 1. 0.94736842 1. 1. 0.94736842 1. 0.94736842 1. 0.94736842] mean value: 0.9736842105263157 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.97368421 0.94736842 0.94736842 0.94736842 1. 0.94736842 1. 0.94736842 0.97368421 0.91812865] mean value: 0.960233918128655 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94736842 0.9047619 0.9 0.9047619 1. 0.9 1. 0.9 0.94736842 0.85714286] mean value: 0.9261403508771929 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.94 Accuracy on Blind test: 0.97 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, 
n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02871037 0.04236031 0.04015183 0.07962561 0.02676129 0.04086113 0.02677369 0.02712107 0.02694225 0.02759171] mean value: 0.036689925193786624 key: score_time value: [0.02021146 0.01885295 0.01714921 0.01255703 0.01522589 0.01539779 0.01532888 0.01534462 0.01538682 0.01636577] mean value: 0.01618204116821289 key: test_mcc value: [0.47633051 0.2773501 0.53300179 0.43643578 0.10660036 0.54554473 0.73786479 0.53300179 0.63129316 0.25301653] mean value: 0.453043953428385 key: train_mcc value: [0.98823529 0.9653073 0.99413485 0.98830369 0.78190435 0.94838881 0.976741 0.95963741 0.94298132 0.88351945] mean value: 0.9429153475266533 key: test_accuracy value: [0.73684211 0.63157895 0.76315789 0.71052632 0.55263158 0.76315789 0.86842105 0.76315789 0.81081081 0.62162162] mean value: 0.7221906116642959 key: train_accuracy value: [0.99411765 0.98235294 0.99705882 0.99411765 0.87941176 0.97352941 0.98823529 0.97941176 0.97067449 0.93841642] mean value: 0.9697326203208556 key: test_fscore value: [0.75 0.5625 0.74285714 0.74418605 0.51428571 0.79069767 0.87179487 0.74285714 0.82051282 0.58823529] mean value: 0.7127926707355572 key: train_fscore value: [0.99411765 0.98203593 0.99706745 0.99408284 0.86287625 0.97280967 0.98809524 0.97897898 0.96987952 0.93416928] mean value: 0.9674112800117264 key: test_precision value: [0.71428571 0.69230769 0.8125 0.66666667 0.5625 0.70833333 0.85 0.8125 0.76190476 0.66666667] mean value: 0.7247664835164835 key: train_precision value: [0.99411765 1. 0.99415205 1. 1. 1. 1. 1. 1. 1. 
] mean value: 0.9988269693842449 key: test_recall value: [0.78947368 0.47368421 0.68421053 0.84210526 0.47368421 0.89473684 0.89473684 0.68421053 0.88888889 0.52631579] mean value: 0.7152046783625731 key: train_recall value: [0.99411765 0.96470588 1. 0.98823529 0.75882353 0.94705882 0.97647059 0.95882353 0.94152047 0.87647059] mean value: 0.9406226350189199 key: test_roc_auc value: [0.73684211 0.63157895 0.76315789 0.71052632 0.55263158 0.76315789 0.86842105 0.76315789 0.8128655 0.62426901] mean value: 0.7226608187134502 key: train_roc_auc value: [0.99411765 0.98235294 0.99705882 0.99411765 0.87941176 0.97352941 0.98823529 0.97941176 0.97076023 0.93823529] mean value: 0.9697230822153423 key: test_jcc value: [0.6 0.39130435 0.59090909 0.59259259 0.34615385 0.65384615 0.77272727 0.59090909 0.69565217 0.41666667] mean value: 0.5650761235543844 key: train_jcc value: [0.98830409 0.96470588 0.99415205 0.98823529 0.75882353 0.94705882 0.97647059 0.95882353 0.94152047 0.87647059] mean value: 0.9394564843481252 MCC on Blind test: 0.42 Accuracy on Blind test: 0.71 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02983665 0.03623152 0.0326581 0.03908563 0.04000688 0.04243374 0.03620481 0.03728127 0.04367828 0.03259063] mean value: 0.03700075149536133 key: score_time value: [0.02391553 0.02225924 0.02418375 0.02630496 0.02330065 0.02267957 0.0241394 0.02215934 0.02327657 0.02062058] mean value: 0.023283958435058594 key: test_mcc value: [0.78947368 0.84327404 0.68421053 0.89473684 0.84327404 0.68803296 0.89973541 0.84327404 0.94736842 0.51461988] mean value: 0.7947999856934739 key: train_mcc value: [0.87648575 0.87648575 0.88235294 0.88241401 0.87648575 0.88825066 0.88825066 0.88825066 0.87684899 0.87121527] mean value: 0.8807040447224638 key: test_accuracy value: [0.89473684 0.92105263 0.84210526 0.94736842 0.92105263 0.84210526 0.94736842 0.92105263 0.97297297 0.75675676] mean value: 0.8966571834992887 key: train_accuracy value: [0.93823529 0.93823529 0.94117647 0.94117647 0.93823529 0.94411765 0.94411765 0.94411765 0.93841642 0.93548387] mean value: 0.9403312057961014 key: test_fscore value: [0.89473684 0.92307692 0.84210526 0.94736842 0.91891892 0.85 0.94444444 0.91891892 0.97297297 0.75675676] mean value: 0.8969299461404725
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:196: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_7030.py:199: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: train_fscore value: [0.93841642 0.93841642 0.94117647 0.9408284 0.93841642 0.9439528 0.9439528 0.94428152 0.93841642 0.93604651] mean value: 0.9403904203379017 key: test_precision value: [0.89473684 0.9 0.84210526 0.94736842 0.94444444 0.80952381 1. 0.94444444 0.94736842 0.77777778] mean value: 0.9007769423558897 key: train_precision value: [0.93567251 0.93567251 0.94117647 0.94642857 0.93567251 0.94674556 0.94674556 0.94152047 0.94117647 0.92528736] mean value: 0.9396098004883142 key: test_recall value: [0.89473684 0.94736842 0.84210526 0.94736842 0.89473684 0.89473684 0.89473684 0.89473684 1. 0.73684211] mean value: 0.8947368421052632 key: train_recall value: [0.94117647 0.94117647 0.94117647 0.93529412 0.94117647 0.94117647 0.94117647 0.94705882 0.93567251 0.94705882] mean value: 0.9412143102855177 key: test_roc_auc value: [0.89473684 0.92105263 0.84210526 0.94736842 0.92105263 0.84210526 0.94736842 0.92105263 0.97368421 0.75730994] mean value: 0.8967836257309941 key: train_roc_auc value: [0.93823529 0.93823529 0.94117647 0.94117647 0.93823529 0.94411765 0.94411765 0.94411765 0.93842449 0.93551772] mean value: 0.9403353973168215 key: test_jcc value: [0.80952381 0.85714286 0.72727273 0.9 0.85 0.73913043 0.89473684 0.85 0.94736842 0.60869565] mean value: 0.818387074405381 key: train_jcc value: [0.8839779 0.8839779 0.88888889 0.88826816 0.8839779 0.89385475 0.89385475 0.89444444 0.8839779 0.87978142] mean value: 0.887500400993959 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)),
('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', 
ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.25618434 0.2671926 0.32188869 0.26167226 0.25118732 0.25854945 0.26166058 0.24911594 0.2661407 0.28530979] mean value: 0.26789016723632814 key: score_time value: [0.02451444 0.02255034 0.02055788 0.02151513 0.02196789 0.01751781 0.02237058 0.0236876 0.02393007 0.02099395] mean value: 0.02196056842803955 key: test_mcc value: [0.78947368 0.78947368 0.68421053 0.89473684 0.84327404 0.68803296 0.89973541 0.84327404 0.94736842 0.51461988] mean value: 0.7894199498433697 key: train_mcc value: [0.87648575 0.78828985 0.88235294 0.88241401 0.87648575 0.88825066 0.88825066 0.88825066 0.87684899 0.87121527] mean value: 0.8718844543680944 key: test_accuracy value: [0.89473684 0.89473684 0.84210526 0.94736842 0.92105263 0.84210526 0.94736842 0.92105263 0.97297297 0.75675676] mean value: 0.8940256045519204 key: train_accuracy value: [0.93823529 0.89411765 0.94117647 0.94117647 0.93823529 0.94411765 0.94411765 0.94411765 0.93841642 0.93548387] mean value: 0.9359194410902191 key: test_fscore value: [0.89473684 0.89473684 0.84210526 0.94736842 0.91891892 0.85 0.94444444 0.91891892 0.97297297 0.75675676] mean value: 0.8940959380433064 key: train_fscore value: [0.93841642 0.89349112 0.94117647 0.9408284 0.93841642 0.9439528 0.9439528 0.94428152 0.93841642 0.93604651] mean value: 0.9358978905351981 key: 
test_precision value: [0.89473684 0.89473684 0.84210526 0.94736842 0.94444444 0.80952381 1. 0.94444444 0.94736842 0.77777778] mean value: 0.9002506265664161 key: train_precision value: [0.93567251 0.89880952 0.94117647 0.94642857 0.93567251 0.94674556 0.94674556 0.94152047 0.94117647 0.92528736] mean value: 0.9359235014072783 key: test_recall value: [0.89473684 0.89473684 0.84210526 0.94736842 0.89473684 0.89473684 0.89473684 0.89473684 1. 0.73684211] mean value: 0.8894736842105263 key: train_recall value: [0.94117647 0.88823529 0.94117647 0.93529412 0.94117647 0.94117647 0.94117647 0.94705882 0.93567251 0.94705882] mean value: 0.9359201926384588 key: test_roc_auc value: [0.89473684 0.89473684 0.84210526 0.94736842 0.92105263 0.84210526 0.94736842 0.92105263 0.97368421 0.75730994] mean value: 0.8941520467836257 key: train_roc_auc value: [0.93823529 0.89411765 0.94117647 0.94117647 0.93823529 0.94411765 0.94411765 0.94411765 0.93842449 0.93551772] mean value: 0.9359236326109391 key: test_jcc value: [0.80952381 0.80952381 0.72727273 0.9 0.85 0.73913043 0.89473684 0.85 0.94736842 0.60869565] mean value: 0.8136251696434763 key: train_jcc value: [0.8839779 0.80748663 0.88888889 0.88826816 0.8839779 0.89385475 0.89385475 0.89444444 0.8839779 0.87978142] mean value: 0.8798512740403147 MCC on Blind test: 0.8 Accuracy on Blind test: 0.9