/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_cd_8020.py:548: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
1.22.4
1.4.1
aaindex_df contains non-numerical data
Total no. of non-numerial columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 1133
PASS: my_features_df and aa_df successfully combined
nrows: 1133
ncols: 274
count of NULL values before imputation
or_mychisq 339
log10_or_mychisq 339
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
No. of numerical features: 169
No. of categorical features: 7
PASS: x_features has no target variable
No. of columns for x_features: 176
-------------------------------------------------------------
Successfully split data with stratification [COMPLETE data]: 80/20
Original data size: (1132, 176)
Train data size: (905, 176)
Test data size: (227, 176)
y_train numbers: Counter({0: 661, 1: 244})
y_train ratio: 2.709016393442623
y_test_numbers: Counter({0: 166, 1: 61})
y_test ratio: 2.721311475409836
-------------------------------------------------------------
index: 0 ind: 1 Mask count check: True
index: 1 ind: 2 Mask count check: True
index: 2 ind: 3 Mask count check: True
Original Data Counter({0: 661, 1: 244}) Data dim: (905, 176)
Simple Random OverSampling Counter({0: 661, 1: 661}) (1322, 176)
Simple Random UnderSampling Counter({0: 244, 1: 244}) (488, 176)
Simple Combined Over and UnderSampling Counter({0: 661, 1: 661}) (1322, 176)
SMOTE_NC OverSampling Counter({0: 661, 1: 661}) (1322, 176)
#####################################################################
Running ML analysis [COMPLETE DATA]: 80/20 split
Gene name: rpoB
Drug name: rifampicin
Output directory: /home/tanu/git/Data/rifampicin/output/ml/tts_cd_8020/
Sanity checks:
Total input features: 176
Training data size: (905, 176)
Test data size: (227, 176)
Target feature numbers (training data): Counter({0: 661, 1: 244})
Target features ratio (training data: 2.709016393442623
Target feature numbers (test data): Counter({0: 166, 1: 61})
Target features ratio (test data): 2.721311475409836
#####################################################################
================================================================
Strucutral features (n): 37
These are:
Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================
AAindex features (n): 123
These are:
['ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106']
================================================================
Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================
Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================
Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================
Pass: No. of features match
#####################################################################
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.04789352 0.06307125 0.0481348 0.10236144 0.18239737 0.0426147 0.10045457 0.15550613 0.0969727 0.05134344]
mean value: 0.08907499313354492

key: score_time
value: [0.01345658 0.01279902 0.01375747 0.01285005 0.01630592 0.02184248 0.01292872 0.01723862 0.01306796 0.02284384]
mean value: 0.015709066390991212

key: test_mcc
value: [0.63725369 0.57246685 0.55878788 0.54842288 0.51496742 0.54545455 0.51508188 0.65091508 0.70352647 0.59215653]
mean value: 0.5839033214050867

key: train_mcc
value: [0.71248167 0.70245135 0.68901123 0.72431999 0.703704 0.71767136 0.72242976 0.69790312 0.69642243 0.69924227]
mean value: 0.7065637192314627

key: test_accuracy
value: [0.85714286 0.83516484 0.82417582 0.82417582 0.81318681 0.82222222 0.82222222 0.86666667 0.88888889 0.83333333]
mean value: 0.8387179487179487

key: train_accuracy
value: [0.88820639 0.88452088 0.87960688 0.89312039 0.88329238 0.89079755 0.89202454 0.88220859 0.88220859 0.88343558]
mean value: 0.8859421775372696

key: test_fscore
value: [0.73469388 0.68085106 0.68 0.66666667 0.63829787 0.66666667 0.61904762 0.73913043 0.76190476 0.70588235]
mean value: 0.6893141315730733

key: train_fscore
value: [0.78787879 0.78037383 0.76995305 0.79625293 0.78359909 0.79058824 0.79534884 0.77777778 0.77570093 0.77751756]
mean value: 0.7834991036799865

key: test_precision
value: [0.72 0.72727273 0.68 0.69565217 0.68181818 0.66666667 0.72222222 0.77272727 0.88888889 0.66666667]
mean value: 0.722191480017567

key: train_precision
value: [0.80861244 0.79904306 0.79227053 0.81730769 0.78181818 0.8195122 0.81428571 0.79245283 0.79807692 0.80193237]
mean value: 0.8025311937742211

key: test_recall
value: [0.75 0.64 0.68 0.64 0.6 0.66666667 0.54166667 0.70833333 0.66666667 0.75 ]
mean value: 0.6643333333333333

key: train_recall
value: [0.76818182 0.76255708 0.74885845 0.77625571 0.78538813 0.76363636 0.77727273 0.76363636 0.75454545 0.75454545]
mean value: 0.7654877542548775

key: test_roc_auc
value: [0.82276119 0.77454545 0.77939394 0.7669697 0.7469697 0.77272727 0.73295455 0.81628788 0.81818182 0.80681818]
mean value: 0.7837609678878336

key: train_roc_auc
value: [0.85042088 0.84598442 0.83829477 0.85619508 0.85235793 0.85072574 0.85586325 0.84484339 0.84197861 0.84281895]
mean value: 0.8479483023318639

key: test_jcc
value: [0.58064516 0.51612903 0.51515152 0.5 0.46875 0.5 0.44827586 0.5862069 0.61538462 0.54545455]
mean value: 0.5275997628159753

key: train_jcc
value: [0.65 0.63984674 0.6259542 0.6614786 0.64419476 0.6536965 0.66023166 0.63636364 0.63358779 0.63601533]
mean value: 0.644136920412421

MCC on Blind test: 0.61
Accuracy on Blind test: 0.85

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO.
of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [2.18454385 3.19450092 3.09683037 1.91522408 1.2263341 1.21541953 1.31338263 1.00562215 1.47615552 1.02071047] mean value: 1.7648723602294922 key: score_time value: [0.02524209 0.05973387 0.05177426 0.01649189 0.02263284 0.01362753 0.01753664 0.0148828 0.02008581 0.01392865] mean value: 0.02559363842010498 key: test_mcc value: [0.62759962 0.6529561 0.67809672 0.54842288 0.44620513 0.51076836 0.45226702 0.67722905 0.70509509 0.61348661] mean value: 0.5912126585754784 key: train_mcc value: [0.83425771 0.80351061 0.82034758 0.82665973 0.77386796 0.81174107 0.76434069 0.73636649 0.78610731 0.75931692] mean value: 0.7916516091901614 key: test_accuracy value: [0.84615385 0.86813187 0.86813187 0.82417582 0.79120879 0.81111111 0.8 0.87777778 0.88888889 0.84444444] mean value: 0.842002442002442 key: train_accuracy value: [0.93488943 0.92260442 0.92997543 0.93243243 0.91154791 0.92638037 0.90797546 
0.89693252 0.91656442 0.90674847] mean value: 0.9186050858443496 key: test_fscore value: [0.73076923 0.72727273 0.76923077 0.66666667 0.57777778 0.63829787 0.57142857 0.75555556 0.77272727 0.72 ] mean value: 0.6929726443768996 key: train_fscore value: [0.87871854 0.85649203 0.86774942 0.87238979 0.83410138 0.86175115 0.82678984 0.80645161 0.84259259 0.82159624] mean value: 0.8468632596467518 key: test_precision value: [0.67857143 0.84210526 0.74074074 0.69565217 0.65 0.65217391 0.66666667 0.80952381 0.85 0.69230769] mean value: 0.7277741687924755 key: train_precision value: [0.88479263 0.85454545 0.88207547 0.88679245 0.84186047 0.87383178 0.84037559 0.81775701 0.85849057 0.84951456] mean value: 0.8590035971963867 key: test_recall value: [0.79166667 0.64 0.8 0.64 0.52 0.625 0.5 0.70833333 0.70833333 0.75 ] mean value: 0.6683333333333333 key: train_recall value: [0.87272727 0.85844749 0.85388128 0.85844749 0.82648402 0.85 0.81363636 0.79545455 0.82727273 0.79545455] mean value: 0.8351805728518057 key: test_roc_auc value: [0.82866915 0.79727273 0.8469697 0.7669697 0.7069697 0.75189394 0.70454545 0.82386364 0.83143939 0.81439394] mean value: 0.7872987336047038 key: train_roc_auc value: [0.91531987 0.90233299 0.90593224 0.90905568 0.88467058 0.90231092 0.87824675 0.86495416 0.88842628 0.87167685] mean value: 0.8922926320106014 key: test_jcc value: [0.57575758 0.57142857 0.625 0.5 0.40625 0.46875 0.4 0.60714286 0.62962963 0.5625 ] mean value: 0.5346458633958634 key: train_jcc value: [0.78367347 0.74900398 0.76639344 0.77366255 0.71541502 0.75708502 0.70472441 0.67567568 0.728 0.69721116] mean value: 0.7350844728023521 MCC on Blind test: 0.61 Accuracy on Blind test: 0.85 Model_name: Gaussian NB Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg',
'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.0187583 0.01387739 0.01324153 0.01288629 0.01322436 0.0131669 0.01319027 0.01312518 0.01313162 0.01309538] mean value: 0.013769721984863282 key: score_time value: [0.01342607 0.01070094 0.0104444 0.0103941 0.01025105 0.01064491 0.01033258 0.0104003 0.01012063 0.01041937] mean value: 0.010713434219360352 key: test_mcc value: [0.46431857 0.34257106 0.3547423 0.44066378 0.51102889 0.44718 0.46296162 0.44441468 0.4674735 0.46313625] mean value: 0.43984906387328504 key: train_mcc value: [0.49764726 0.46577226 0.48255617 0.46105197 0.4887123 0.48766662 0.46709067 0.46459558 0.51438731 0.45890879] mean value: 0.4788388923833725 key: test_accuracy value: [0.76923077 0.69230769 0.73626374 0.74725275 0.79120879 0.77777778 0.75555556 0.75555556 0.78888889 0.77777778] mean value: 0.7591819291819292 key: train_accuracy value: [0.79115479 0.74078624 0.77886978 0.77027027 0.78869779 0.78650307 0.77546012 0.77177914 0.79509202 0.77055215] mean value: 0.7769165372846354 key: test_fscore value: [0.61818182 0.5483871 0.53846154 0.61016949 0.65454545 0.6 0.62068966 0.60714286 0.6122449 0.61538462] mean value: 0.6025207425147499 key: train_fscore value: [0.64135021 0.62388592 0.63265306 0.61758691 0.63404255 0.63445378 0.62111801 0.62040816 0.65424431 0.61601643] mean value: 0.6295759346178662 key: test_precision value: [0.5483871 0.45945946 0.51851852 0.52941176 0.6 0.57692308 0.52941176 0.53125 0.6 0.57142857] mean value: 0.5464790252515584 key: train_precision value: [0.5984252 0.51169591 0.57195572 
0.55925926 0.5936255 0.58984375 0.57034221 0.56296296 0.60076046 0.56179775] mean value: 0.5720668707476475 key: test_recall value: [0.70833333 0.68 0.56 0.72 0.72 0.625 0.75 0.70833333 0.625 0.66666667] mean value: 0.6763333333333333 key: train_recall value: [0.69090909 0.79908676 0.70776256 0.68949772 0.6803653 0.68636364 0.68181818 0.69090909 0.71818182 0.68181818] mean value: 0.7026712328767123 key: test_roc_auc value: [0.74968905 0.68848485 0.68151515 0.73878788 0.76909091 0.72916667 0.75378788 0.7405303 0.73674242 0.74242424] mean value: 0.7330219357756671 key: train_roc_auc value: [0.75959596 0.75920724 0.75640229 0.74474886 0.75446836 0.75494652 0.74595111 0.74629488 0.77085561 0.74258976] mean value: 0.753506060373506 key: test_jcc value: [0.44736842 0.37777778 0.36842105 0.43902439 0.48648649 0.42857143 0.45 0.43589744 0.44117647 0.44444444] mean value: 0.43191679076939216 key: train_jcc value: [0.47204969 0.45336788 0.46268657 0.44674556 0.46417445 0.46461538 0.45045045 0.44970414 0.48615385 0.44510386] mean value: 0.459505183000996 MCC on Blind test: 0.43 Accuracy on Blind test: 0.77 Model_name: Naive Bayes Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01282287 0.01630855 0.01634121 0.01639724 0.01683545 0.01645255 0.01656127 0.01641417 0.01643634 0.01640034] mean value: 0.016096997261047363 key: score_time value: [0.01217628 0.01253057 0.01262355 0.01262665 0.01264167 0.01256514 0.01254058 0.01260877 0.01280284 0.01267886] mean value: 0.01257948875427246 key: test_mcc value: [0.52551942 0.60507041 0.45746799 0.30563727 0.39333333 0.30012252 0.40291148 0.64071161 0.4755188 0.44718 ] mean value: 0.45534728443319816 key: train_mcc value: [0.53038838 0.50974707 0.52491489 0.53390093 0.54820336 0.54668176 0.49997774 0.50534121 0.55293504 0.52295749] mean value: 0.5275047871297697 key: test_accuracy value: [0.81318681 0.84615385 0.79120879 0.73626374 0.75824176 0.73333333 0.77777778 0.86666667 0.8 0.77777778] mean value: 0.7900610500610501 key: train_accuracy value: [0.81449631 0.81326781 0.81695332 0.82063882 0.82678133 0.82208589 0.80981595 0.80613497 0.82453988 0.81717791] mean value: 0.8171892193364586 key: test_fscore value: [0.65306122 0.70833333 0.59574468 0.47826087 0.56 0.47826087 0.54545455 0.71428571 0.60869565 0.6 ] mean value: 0.5942096889718801 key: train_fscore value: [0.65759637 0.63285024 0.64775414 0.65402844 0.66348449 0.66819222 0.62469734 0.63761468 0.67276888 0.64439141] mean value: 0.6503378195409838 key: test_precision value: [0.64 0.73913043 0.63636364 0.52380952 0.56 0.5 0.6 0.83333333 0.63636364 0.57692308] mean value: 0.6245923641575816 key: train_precision value: [0.6561086 0.67179487 0.67156863 0.67980296 0.695 0.67281106 0.66839378 0.64351852 0.67741935 0.67839196] mean value: 
0.6714809727643422 key: test_recall value: [0.66666667 0.68 0.56 0.44 0.56 0.45833333 0.5 0.625 0.58333333 0.625 ] mean value: 0.5698333333333333 key: train_recall value: [0.65909091 0.59817352 0.62557078 0.63013699 0.6347032 0.66363636 0.58636364 0.63181818 0.66818182 0.61363636] mean value: 0.6311311747613118 key: test_roc_auc value: [0.76616915 0.79454545 0.71939394 0.64424242 0.69666667 0.64583333 0.68939394 0.78977273 0.73106061 0.72916667] mean value: 0.7206244911804613 key: train_roc_auc value: [0.76557239 0.74530525 0.75648287 0.76044664 0.76609109 0.77215432 0.73940031 0.75120321 0.77526738 0.75303667] mean value: 0.7584960120757864 key: test_jcc value: [0.48484848 0.5483871 0.42424242 0.31428571 0.38888889 0.31428571 0.375 0.55555556 0.4375 0.42857143] mean value: 0.4271565307452404 key: train_jcc value: [0.48986486 0.46289753 0.47902098 0.48591549 0.49642857 0.50171821 0.45422535 0.46801347 0.50689655 0.47535211] mean value: 0.4820333132358686 MCC on Blind test: 0.48 Accuracy on Blind test: 0.81 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01524544 0.01264095 0.01103282 0.01104927 0.01092625 0.01289582 0.01252794 0.01093578 0.01066732 0.01054311] mean value: 0.011846470832824706 key: score_time value: [0.10600185 0.01958251 0.02010822 0.0199244 0.01993418 0.02180696 0.0200088 0.02038217 0.02008438 0.01971579] mean value: 0.028754925727844237 key: test_mcc value: [0.4359729 0.2248982 0.25060326 0.15668154 0.32373419 0.24534987 0.40451992 0.4792982 0.35478744 0.35478744] mean value: 0.32306329498533326 key: train_mcc value: [0.55031 0.57866175 0.58609779 0.59710057 0.59745376 0.59928007 0.57365543 0.57455254 0.57119357 0.59242769] mean value: 0.5820733171911701 key: test_accuracy value: [0.8021978 0.73626374 0.72527473 0.72527473 0.75824176 0.73333333 0.78888889 0.81111111 0.77777778 0.77777778] mean value: 0.7636141636141636 key: train_accuracy value: [0.83415233 0.84520885 0.84766585 0.85135135 0.85135135 0.85153374 0.84294479 0.84294479 0.84171779 0.84907975] mean value: 0.8457950588625435 key: test_fscore value: [0.52631579 0.33333333 0.41860465 0.24242424 0.45 0.4 0.51282051 0.58536585 0.44444444 0.44444444] mean value: 0.4357753271761989 key: train_fscore value: [0.64 0.64804469 0.65555556 0.66481994 0.67029973 0.67385445 0.65027322 0.65591398 0.6541555 0.67024129] mean value: 0.6583158353231275 key: test_precision value: [0.71428571 0.54545455 0.5 0.5 0.6 0.5 0.66666667 0.70588235 0.66666667 0.66666667] mean value: 0.6065622612681436 key: train_precision value: [0.77419355 0.83453237 0.83687943 0.84507042 0.83108108 0.82781457 0.81506849 0.80263158 0.79738562 0.81699346] mean value: 
0.8181650585330019 key: test_recall value: [0.41666667 0.24 0.36 0.16 0.36 0.33333333 0.41666667 0.5 0.33333333 0.33333333] mean value: 0.3453333333333333 key: train_recall value: [0.54545455 0.52968037 0.53881279 0.54794521 0.56164384 0.56818182 0.54090909 0.55454545 0.55454545 0.56818182] mean value: 0.5509900373599004 key: test_roc_auc value: [0.67848259 0.58212121 0.61181818 0.54969697 0.63454545 0.60606061 0.67045455 0.71212121 0.63636364 0.63636364] mean value: 0.6318028041610131 key: train_roc_auc value: [0.74326599 0.74551245 0.75007866 0.75548521 0.75981351 0.76224217 0.74776547 0.75206264 0.75122231 0.7605615 ] mean value: 0.7528009915741585 key: test_jcc value: [0.35714286 0.2 0.26470588 0.13793103 0.29032258 0.25 0.34482759 0.4137931 0.28571429 0.28571429] mean value: 0.2830151615707462 key: train_jcc value: [0.47058824 0.47933884 0.48760331 0.49792531 0.50409836 0.50813008 0.48178138 0.488 0.48605578 0.50403226] mean value: 0.49075535486894833 MCC on Blind test: 0.38 Accuracy on Blind test: 0.78 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, 
gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.04472566 0.03670692 0.03695655 0.03710032 0.03720188 0.03994393 0.04123163 0.04074073 0.03910685 0.03895974] mean value: 0.03926742076873779 key: score_time value: [0.01724148 0.01627874 0.0164144 0.01637268 0.01719189 0.01736212 0.01677203 0.01638651 0.01843786 0.01628327] mean value: 0.016874098777770997 key: test_mcc value: [0.56001732 0.51496742 0.63725369 0.50565559 0.4351278 0.38019877 0.55001241 0.59244966 0.60677988 0.54545455] mean value: 0.532791707378351 key: train_mcc value: [0.66223433 0.67434671 0.6577144 0.68087197 0.68545586 0.66431904 0.66727313 0.65537963 0.66901689 0.67744119] mean value: 0.6694053147134513 key: test_accuracy value: [0.83516484 0.81318681 0.85714286 0.81318681 0.78021978 0.76666667 0.83333333 0.84444444 0.85555556 0.82222222] mean value: 0.8221123321123321 key: train_accuracy value: [0.86977887 0.87592138 0.86977887 0.87837838 0.87837838 0.87239264 0.86993865 0.86871166 0.87361963 0.87730061] mean value: 0.8734199062419921 key: test_fscore value: [0.66666667 0.63829787 0.73469388 0.62222222 0.58333333 0.53333333 0.65116279 0.69565217 0.66666667 0.66666667] mean value: 0.6458695603391053 key: train_fscore value: [0.74881517 0.75425791 0.74146341 0.75912409 0.76705882 0.74509804 0.75576037 0.73965937 0.75060533 0.75490196] mean value: 0.7516744462110857 key: test_precision value: [0.71428571 0.68181818 0.75 0.7 0.60869565 0.57142857 0.73684211 0.72727273 0.86666667 0.66666667] mean value: 0.7023676285575599 key: train_precision value: [0.78217822 0.80729167 0.79581152 0.8125 0.79126214 0.80851064 0.76635514 0.79581152 0.80310881 
0.81914894] mean value: 0.798197858000515 key: test_recall value: [0.625 0.6 0.72 0.56 0.56 0.5 0.58333333 0.66666667 0.54166667 0.66666667] mean value: 0.6023333333333334 key: train_recall value: [0.71818182 0.70776256 0.69406393 0.71232877 0.74429224 0.69090909 0.74545455 0.69090909 0.70454545 0.7 ] mean value: 0.7108447488584475 key: test_roc_auc value: [0.76772388 0.7469697 0.81454545 0.73454545 0.71181818 0.68181818 0.75378788 0.78787879 0.75568182 0.77272727] mean value: 0.7527496607869741 key: train_roc_auc value: [0.82205387 0.82278884 0.81425885 0.82591228 0.83601166 0.81520244 0.83071047 0.81268144 0.82033995 0.82142857] mean value: 0.822138838792747 key: test_jcc value: [0.5 0.46875 0.58064516 0.4516129 0.41176471 0.36363636 0.48275862 0.53333333 0.5 0.5 ] mean value: 0.4792501088057834 key: train_jcc value: [0.59848485 0.60546875 0.58914729 0.61176471 0.6221374 0.59375 0.60740741 0.58687259 0.60077519 0.60629921] mean value: 0.6022107396445928 MCC on Blind test: 0.49 Accuracy on Blind test: 0.81 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.22346926 5.16487861 5.78156376 3.83858371 1.55056047 1.29423809 0.99304223 2.79679513 3.26674962 1.57767797] mean value: 2.7487558841705324 key: score_time value: [0.013201 0.02823043 0.02088594 0.02056909 0.02059984 0.0205822 0.02068686 0.02088261 0.02082276 0.02058458] mean value: 0.0207045316696167 key: test_mcc value: [0.64850123 0.49781081 0.62455652 0.57031192 0.57031192 0.44250602 0.50822474 0.57789674 0.73663511 0.63960215] mean value: 0.5816357147239263 key: train_mcc value: [0.66764601 0.66977539 0.74152285 0.80197555 0.70737076 0.53141196 0.58027983 0.70480057 0.74133067 0.67564324] mean value: 0.6821756844169496 key: test_accuracy value: [0.85714286 0.81318681 0.85714286 0.82417582 0.82417582 0.8 0.82222222 0.84444444 0.9 0.84444444] mean value: 0.8386935286935286 key: train_accuracy value: [0.86732187 0.87592138 0.9017199 0.91891892 0.88083538 0.82944785 0.84294479 0.88834356 0.89079755 0.86503067] mean value: 0.8761281861895359 key: test_fscore value: [0.74509804 0.60465116 0.71111111 0.69230769 0.69230769 0.55 0.6 0.66666667 0.8 0.74074074] mean value: 0.6802883105140287 key: train_fscore value: [0.75892857 0.72176309 0.79166667 0.85652174 0.78867102 0.6017192 0.6751269 0.76606684 0.81341719 0.76694915] mean value: 0.7540830369215625 key: test_precision value: [0.7037037 0.72222222 0.8 0.66666667 0.66666667 0.6875 0.75 0.77777778 0.85714286 0.66666667] mean value: 0.729834656084656 key: train_precision value: [0.74561404 0.90972222 0.92121212 0.81742739 0.75416667 0.81395349 0.76436782 0.8816568 0.75486381 0.71825397] mean 
value: 0.8081238321762161 key: test_recall value: [0.79166667 0.52 0.64 0.72 0.72 0.45833333 0.5 0.58333333 0.75 0.83333333] mean value: 0.6516666666666666 key: train_recall value: [0.77272727 0.59817352 0.69406393 0.89954338 0.82648402 0.47727273 0.60454545 0.67727273 0.88181818 0.82272727] mean value: 0.7254628476546284 key: test_roc_auc value: [0.83613184 0.72212121 0.78969697 0.79181818 0.79181818 0.69128788 0.71969697 0.76136364 0.85227273 0.84090909] mean value: 0.7797116689280869 key: train_roc_auc value: [0.83754209 0.78816239 0.83610759 0.9127969 0.86366218 0.7184683 0.76781895 0.82182964 0.88796791 0.85169977] mean value: 0.8286055714661678 key: test_jcc value: [0.59375 0.43333333 0.55172414 0.52941176 0.52941176 0.37931034 0.42857143 0.5 0.66666667 0.58823529] mean value: 0.5200414734859461 key: train_jcc value: [0.61151079 0.56465517 0.65517241 0.74904943 0.65107914 0.43032787 0.50957854 0.62083333 0.68551237 0.62199313] mean value: 0.6099712184808272 MCC on Blind test: 0.58 Accuracy on Blind test: 0.84 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.06455588 0.04757524 0.05126166 0.04958081 0.07174301 0.09911942 0.0670929 0.05137992 0.04852629 0.05046058] mean value: 0.06012957096099854 key: score_time value: [0.01296258 0.01292348 0.0129292 0.01287627 0.02691889 0.02911472 0.01300859 0.03288078 0.01305461 0.01302671] mean value: 0.017969584465026854 key: test_mcc value: [0.73758379 0.5916592 0.67809672 0.76472717 0.71465185 0.61348661 0.67187336 0.73450514 0.67722905 0.69290233] mean value: 0.6876715224801788 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9010989 0.83516484 0.86813187 0.89010989 0.89010989 0.84444444 0.87777778 0.88888889 0.87777778 0.86666667] mean value: 0.874017094017094 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8 0.70588235 0.76923077 0.82758621 0.7826087 0.72 0.73170732 0.80769231 0.75555556 0.77777778] mean value: 0.7678040982819483 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85714286 0.69230769 0.74074074 0.72727273 0.85714286 0.69230769 0.88235294 0.75 0.80952381 0.7 ] mean value: 0.7708791317614847 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.75 0.72 0.8 0.96 0.72 0.75 0.625 0.875 0.70833333 0.875 ] mean value: 0.7783333333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.85261194 0.79939394 0.8469697 0.91181818 0.83727273 0.81439394 0.79734848 0.8844697 0.82386364 0.86931818] mean value: 0.8437460425146992 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.66666667 0.54545455 0.625 0.70588235 0.64285714 0.5625 0.57692308 0.67741935 0.60714286 0.63636364] mean value: 0.6246209633187811 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.7 Accuracy on Blind test: 0.88 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.34010911 0.24522424 0.31502557 0.2490201 0.29990029 0.24824834 0.33211732 0.20768809 0.17433023 0.21552944] mean value: 0.2627192735671997 key: score_time value: [0.02708316 0.0260675 0.02612591 0.02618861 0.02614689 0.02590108 0.02695704 0.0190289 0.02099538 0.02617288] mean value: 0.025066733360290527 key: test_mcc value: [0.59397623 0.54842288 0.48092924 0.53935989 0.59779054 0.49901088 0.51508188 0.67314951 0.67419986 0.52378493] mean value: 0.5645705839858854 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.84615385 0.82417582 0.8021978 0.82417582 0.84615385 0.81111111 0.82222222 0.87777778 0.87777778 0.82222222] mean value: 0.8353968253968254 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.69565217 0.66666667 0.60869565 0.65217391 0.69565217 0.62222222 0.61904762 0.74418605 0.71794872 0.63636364] mean value: 0.6658608821803969 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.72727273 0.69565217 0.66666667 0.71428571 0.76190476 0.66666667 0.72222222 0.84210526 0.93333333 0.7 ] mean value: 0.743010952942303 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 0.64 0.56 0.6 0.64 0.58333333 0.54166667 0.66666667 0.58333333 0.58333333] mean value: 0.6065 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.78855721 0.7669697 0.7269697 0.75454545 0.78212121 0.73863636 0.73295455 0.81060606 0.78409091 0.74621212] mean value: 0.7631663274536409 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.53333333 0.5 0.4375 0.48387097 0.53333333 0.4516129 0.44827586 0.59259259 0.56 0.46666667] mean value: 0.5007185658962633 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.51 Accuracy on Blind test: 0.82 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01538968 0.0140624 0.01389027 0.01316977 0.01276588 0.01181865 0.01234031 0.01337767 0.01329684 0.01371527] mean value: 0.013382673263549805 key: score_time value: [0.01168752 0.01035738 0.01049376 0.0119102 0.01032829 0.00932217 0.01101422 0.00986791 0.00957704 0.01047921] mean value: 0.010503768920898438 key: test_mcc value: [0.30340909 0.38675467 0.44066378 0.50565559 0.2830303 0.11903254 0.26967994 0.51741002 0.26382243 0.44718 ] mean value: 0.3536638368258284 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.71428571 0.76923077 0.74725275 0.81318681 0.71428571 0.67777778 0.74444444 0.8 0.73333333 0.77777778] mean value: 0.7491575091575091 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.5 0.53333333 0.61016949 0.62222222 0.48 0.3255814 0.41025641 0.65384615 0.42857143 0.6 ] mean value: 0.5163980435103809 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.46428571 0.6 0.52941176 0.7 0.48 0.36842105 0.53333333 0.60714286 0.5 0.57692308] mean value: 0.5359517799022443 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.54166667 0.48 0.72 0.56 0.48 0.29166667 0.33333333 0.70833333 0.375 0.625 ] mean value: 0.5115 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.65889303 0.67939394 0.73878788 0.73454545 0.64151515 0.55492424 0.61363636 0.77083333 0.61931818 0.72916667] mean value: 0.6741014246947082 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.33333333 0.36363636 0.43902439 0.4516129 0.31578947 0.19444444 0.25806452 0.48571429 0.27272727 0.42857143] mean value: 0.354291841171008 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.33 Accuracy on Blind test: 0.75 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, 
gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [2.64140868 3.81011534 3.99258542 4.03306317 4.11460781 4.15739083 4.4136889 3.99358988 4.03670621 3.72725749] mean value: 3.8920413732528685 key: score_time value: [0.10188866 0.2263751 0.13317132 0.13334799 0.1331985 0.13286519 0.13206124 0.13259816 0.13317728 0.10428119] mean value: 0.13629646301269532 key: test_mcc value: [0.74218994 0.62420432 0.83151316 0.62455652 0.80485509 0.57966713 0.61158096 0.77272727 0.79628662 0.71590909] mean value: 0.7103490118151145 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9010989 0.84615385 0.93406593 0.85714286 0.92307692 0.83333333 0.85555556 0.91111111 0.92222222 0.88888889] mean value: 0.8872649572649572 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.80851064 0.73076923 0.86956522 0.71111111 0.85714286 0.69387755 0.69767442 0.83333333 0.8372093 0.79166667] mean value: 0.7830860326663017 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.82608696 0.7037037 0.95238095 0.8 0.875 0.68 0.78947368 0.83333333 0.94736842 0.79166667] mean value: 0.8199013717869553 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.79166667 0.76 0.8 0.64 0.84 0.70833333 0.625 0.83333333 0.75 0.79166667] mean value: 0.754 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.86598259 0.81939394 0.89242424 0.78969697 0.89727273 0.79356061 0.78219697 0.88636364 0.86742424 0.85795455] mean value: 0.8452270465852556 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.67857143 0.57575758 0.76923077 0.55172414 0.75 0.53125 0.53571429 0.71428571 0.72 0.65517241] mean value: 0.6481706325283911 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.73 Accuracy on Blind test: 0.89 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, 
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn( key: fit_time value: [2.11717629 1.22336912 1.25936317 1.12006736 1.14099693 1.24300289 1.19389844 1.1325438 1.14318395 1.1529448 ] mean value: 1.2726546764373778 key: score_time value: [0.22170925 0.1627214 0.2340374 0.13864779 0.16498137 0.19844413 0.16833639 0.17471385 0.2893374 0.24061823] mean value: 0.19935472011566163 key: test_mcc value: [0.74218994 0.69312083 0.83151316 0.62455652 0.71465185 0.56837381 0.61158096 0.79879562 0.73606509 0.61782299] mean value: 0.6938670782096882 key: train_mcc value: [0.9214512 0.94658694 0.92441313 0.93079515 0.9275678 0.92467251 0.93417988 0.93414825 0.92467251 0.92788451] mean value: 0.9296371873548671 key: test_accuracy value: [0.9010989 0.87912088 0.93406593 0.85714286 0.89010989 0.83333333 0.85555556 0.92222222 0.9 0.85555556] mean value: 0.8828205128205128 key: train_accuracy value: [0.96928747 0.97911548 0.97051597 0.97297297 0.97174447 0.97055215 0.97423313 0.97423313 0.97055215 0.97177914] mean value: 0.9724986056887898 key: test_fscore value: [0.80851064 0.7755102 0.86956522 0.71111111 0.7826087 0.68085106 0.69767442 0.85106383 0.7804878 0.71111111] mean value: 0.7668494094744926 key: train_fscore value: [0.94172494 0.96037296 0.94339623 0.94811321 0.94613583 0.94366197 0.95081967 0.9512761 0.94366197 0.94588235] mean value: 0.9475045238264362 key: test_precision value: [0.82608696 0.79166667 0.95238095 0.8 0.85714286 0.69565217 0.78947368 0.86956522 0.94117647 0.76190476] mean value: 0.8285049740720086 key: train_precision value: [0.96650718 0.98095238 0.97560976 0.9804878 0.97115385 0.97572816 0.98067633 0.97156398 0.97572816 0.9804878 ] mean
value: 0.975889539021806 key: test_recall value: [0.79166667 0.76 0.8 0.64 0.72 0.66666667 0.625 0.83333333 0.66666667 0.66666667] mean value: 0.717 key: train_recall value: [0.91818182 0.94063927 0.91324201 0.91780822 0.92237443 0.91363636 0.92272727 0.93181818 0.91363636 0.91363636] mean value: 0.9207700290577003 key: test_roc_auc value: [0.86598259 0.84212121 0.89242424 0.78969697 0.83727273 0.78030303 0.78219697 0.89393939 0.82575758 0.79545455] mean value: 0.8305149253731343 key: train_roc_auc value: [0.95319865 0.96695829 0.95241932 0.95554277 0.9561452 0.9526165 0.95800229 0.96086707 0.9526165 0.95345684] mean value: 0.9561823435614732 key: test_jcc value: [0.67857143 0.63333333 0.76923077 0.55172414 0.64285714 0.51612903 0.53571429 0.74074074 0.64 0.55172414] mean value: 0.6260025008567834 key: train_jcc value: [0.88986784 0.92376682 0.89285714 0.90134529 0.89777778 0.89333333 0.90625 0.90707965 0.89333333 0.89732143] mean value: 0.9002932610923725 MCC on Blind test: 0.69 Accuracy on Blind test: 0.88 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, 
gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02068615 0.04053044 0.04051805 0.04072928 0.04056287 0.01637459 0.01640034 0.01641941 0.01644111 0.02096176] mean value: 0.02696239948272705 key: score_time value: [0.01229119 0.02326393 0.02206826 0.02382994 0.01262593 0.01253796 0.01262951 0.01260781 0.01265311 0.01269269] mean value: 0.015720033645629884 key: test_mcc value: [0.52551942 0.60507041 0.45746799 0.30563727 0.39333333 0.30012252 0.40291148 0.64071161 0.4755188 0.44718 ] mean value: 0.45534728443319816 key: train_mcc value: [0.53038838 0.50974707 0.52491489 0.53390093 0.54820336 0.54668176 0.49997774 0.50534121 0.55293504 0.52295749] mean value: 0.5275047871297697 key: test_accuracy value: [0.81318681 0.84615385 0.79120879 0.73626374 0.75824176 0.73333333 0.77777778 0.86666667 0.8 0.77777778] mean value: 0.7900610500610501 key: train_accuracy value: [0.81449631 0.81326781 0.81695332 0.82063882 0.82678133 0.82208589 0.80981595 0.80613497 0.82453988 0.81717791] mean value: 0.8171892193364586 key: test_fscore value: [0.65306122 0.70833333 0.59574468 0.47826087 0.56 0.47826087 0.54545455 0.71428571 0.60869565 0.6 ] mean value: 0.5942096889718801 key: train_fscore value: [0.65759637 0.63285024 0.64775414 0.65402844 0.66348449 0.66819222 0.62469734 0.63761468 0.67276888 0.64439141] mean value: 0.6503378195409838 key: test_precision value: [0.64 0.73913043 0.63636364 0.52380952 0.56 0.5 0.6 0.83333333 0.63636364 0.57692308] mean value: 0.6245923641575816 key: train_precision value: [0.6561086 0.67179487 0.67156863 0.67980296 0.695 0.67281106 0.66839378 0.64351852 0.67741935 0.67839196] mean value: 
0.6714809727643422 key: test_recall value: [0.66666667 0.68 0.56 0.44 0.56 0.45833333 0.5 0.625 0.58333333 0.625 ] mean value: 0.5698333333333333 key: train_recall value: [0.65909091 0.59817352 0.62557078 0.63013699 0.6347032 0.66363636 0.58636364 0.63181818 0.66818182 0.61363636] mean value: 0.6311311747613118 key: test_roc_auc value: [0.76616915 0.79454545 0.71939394 0.64424242 0.69666667 0.64583333 0.68939394 0.78977273 0.73106061 0.72916667] mean value: 0.7206244911804613 key: train_roc_auc value: [0.76557239 0.74530525 0.75648287 0.76044664 0.76609109 0.77215432 0.73940031 0.75120321 0.77526738 0.75303667] mean value: 0.7584960120757864 key: test_jcc value: [0.48484848 0.5483871 0.42424242 0.31428571 0.38888889 0.31428571 0.375 0.55555556 0.4375 0.42857143] mean value: 0.4271565307452404 key: train_jcc value: [0.48986486 0.46289753 0.47902098 0.48591549 0.49642857 0.50171821 0.45422535 0.46801347 0.50689655 0.47535211] mean value: 0.4820333132358686 MCC on Blind test: 0.48 Accuracy on Blind test: 0.81 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [2.75444412 2.56730151 2.74305201 2.68936348 2.72433519 2.71856141 2.70131421 2.74700999 2.77551627 2.747895 ] mean value: 2.7168793201446535 key: score_time value: [0.01431656 0.0132854 0.01292968 0.0139606 0.0135746 0.01367378 0.01520157 0.01331019 0.01399136 0.01370096] mean value: 0.013794469833374023 key: test_mcc value: [0.81227628 0.53716427 0.88969697 0.78588153 0.77501303 0.71328456 0.73471806 0.83522876 0.79604116 0.80405441] mean value: 0.7683359042790965 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.92307692 0.81318681 0.95604396 0.91208791 0.91208791 0.87777778 0.9 0.93333333 0.92222222 0.92222222] mean value: 0.9072039072039072 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8627451 0.66666667 0.92 0.84615385 0.83333333 0.79245283 0.79069767 0.88 0.84444444 0.85714286] mean value: 0.8293636750387647 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.81481481 0.65384615 0.92 0.81481481 0.86956522 0.72413793 0.89473684 0.84615385 0.9047619 0.84 ] mean value: 0.8282831524922585 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.91666667 0.68 0.92 0.88 0.8 0.875 0.70833333 0.91666667 0.79166667 0.875 ] mean value: 0.8363333333333334 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.9210199 0.77181818 0.94484848 0.90212121 0.87727273 0.87689394 0.83901515 0.9280303 0.88068182 0.90719697] mean value: 0.88488986883763 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.75862069 0.5 0.85185185 0.73333333 0.71428571 0.65625 0.65384615 0.78571429 0.73076923 0.75 ] mean value: 0.7134671259455743 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.9 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), 
('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.06290793 0.09113836 0.09762669 0.08045506 0.1088562 0.09835911 0.10149932 0.09140635 0.09574938 0.07322192] mean value: 0.09012203216552735 key: score_time value: [0.03614759 0.01270986 0.02036381 0.02011633 0.02052951 0.02215934 0.02006292 0.01273179 0.02016211 0.01268744] mean value: 0.019767069816589357 key: test_mcc value: [0.68163138 0.48266935 0.60314831 0.45746799 0.46252711 0.5716838 0.51076836 0.61158096 0.65091508 0.55805107] mean value: 0.5590443404174392 key: train_mcc value: [0.74113148 0.74177387 0.7319774 0.73266902 0.74663937 0.7338776 0.73357097 0.71200359 0.71200359 0.72874952] mean value: 0.7314396423353031 key: test_accuracy value: [0.86813187 0.79120879 
0.83516484 0.79120879 0.78021978 0.82222222 0.81111111 0.85555556 0.86666667 0.82222222] mean value: 0.8243711843711844 key: train_accuracy value: [0.8980344 0.8992629 0.8955774 0.8955774 0.9004914 0.89693252 0.89570552 0.88711656 0.88711656 0.89447853] mean value: 0.8950293182195023 key: test_fscore value: [0.76923077 0.62745098 0.71698113 0.59574468 0.61538462 0.69230769 0.63829787 0.69767442 0.73913043 0.68 ] mean value: 0.6772202595969455 key: train_fscore value: [0.81093394 0.81018519 0.80278422 0.80369515 0.81464531 0.8028169 0.8045977 0.78899083 0.78899083 0.8 ] mean value: 0.8027640061671473 key: test_precision value: [0.71428571 0.61538462 0.67857143 0.63636364 0.59259259 0.64285714 0.65217391 0.78947368 0.77272727 0.65384615] mean value: 0.6748276153882561 key: train_precision value: [0.81278539 0.82159624 0.81603774 0.81308411 0.81651376 0.83009709 0.81395349 0.7962963 0.7962963 0.81904762] mean value: 0.8135708029116734 key: test_recall value: [0.83333333 0.64 0.76 0.56 0.64 0.75 0.625 0.625 0.70833333 0.70833333] mean value: 0.685 key: train_recall value: [0.80909091 0.79908676 0.78995434 0.79452055 0.81278539 0.77727273 0.79545455 0.78181818 0.78181818 0.78181818] mean value: 0.7923619759236198 key: test_roc_auc value: [0.85696517 0.74424242 0.81181818 0.71939394 0.73666667 0.79924242 0.75189394 0.78219697 0.81628788 0.78598485] mean value: 0.7804692446856626 key: train_roc_auc value: [0.87003367 0.86761061 0.86220406 0.86364683 0.87277925 0.8592246 0.86411383 0.8539343 0.8539343 0.85897632] mean value: 0.862645775897186 key: test_jcc value: [0.625 0.45714286 0.55882353 0.42424242 0.44444444 0.52941176 0.46875 0.53571429 0.5862069 0.51515152] mean value: 0.5144887717364898 key: train_jcc value: [0.68199234 0.68093385 0.67054264 0.67181467 0.68725869 0.67058824 0.67307692 0.65151515 0.65151515 0.66666667] mean value: 0.6705904312105113 MCC on Blind test: 0.57 Accuracy on Blind test: 0.84 Model_name: Multinomial Model func: MultinomialNB() List of 
models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model 
pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02526474 0.01550508 0.02460146 0.01528096 0.03058648 0.01548696 0.01551771 0.0158608 0.01576138 0.01571155] mean value: 0.01895771026611328 key: score_time value: [0.01229072 0.01626825 0.01230955 0.01229024 0.02189422 0.01223373 0.01224208 0.01224113 0.01226044 0.01224518] mean value: 0.01362755298614502 key: test_mcc value: [0.53931786 0.54842288 0.51496742 0.49177534 0.46965229 0.51076836 0.54545455 0.64465837 0.48863636 0.50261554] mean value: 0.5256268946384712 key: train_mcc value: [0.51519725 0.53378947 0.52880604 0.53260773 0.55896121 0.54917868 0.54291619 0.52792195 0.55056103 0.54668176] mean value: 0.5386621323453594 key: test_accuracy value: [0.81318681 0.82417582 0.81318681 0.8021978 0.79120879 0.81111111 0.82222222 0.86666667 0.8 0.8 ] mean value: 0.8143956043956044 key: train_accuracy value: [0.80958231 0.81818182 0.81572482 0.81695332 0.82678133 0.82331288 0.8208589 0.81472393 0.82208589 0.82208589] mean value: 0.8190291071886164 key: test_fscore value: [0.66666667 0.66666667 0.63829787 0.625 0.6122449 0.63829787 0.66666667 0.72727273 0.625 0.64 ] mean value: 0.6506113369912762 key: train_fscore value: [0.64530892 0.65740741 0.65437788 0.65747126 0.67734554 0.66972477 0.66513761 0.65446224 0.67268623 0.66819222] mean value: 0.662211409201409 key: 
test_precision value: [0.62962963 0.69565217 0.68181818 0.65217391 0.625 0.65217391 0.66666667 0.8 0.625 0.61538462] mean value: 0.6643499093499093 key: train_precision value: [0.64976959 0.66666667 0.66046512 0.66203704 0.67889908 0.67592593 0.6712963 0.65898618 0.66816143 0.67281106] mean value: 0.666501838002788 key: test_recall value: [0.70833333 0.64 0.6 0.6 0.6 0.625 0.66666667 0.66666667 0.625 0.66666667] mean value: 0.6398333333333334 key: train_recall value: [0.64090909 0.64840183 0.64840183 0.65296804 0.67579909 0.66363636 0.65909091 0.65 0.67727273 0.66363636] mean value: 0.6580116230801162 key: test_roc_auc value: [0.7795398 0.7669697 0.7469697 0.73939394 0.73181818 0.75189394 0.77272727 0.8030303 0.74431818 0.75757576] mean value: 0.7594236770691994 key: train_roc_auc value: [0.75648148 0.76453705 0.76285638 0.76513948 0.77907601 0.77299465 0.76988159 0.76281513 0.77645149 0.77215432] mean value: 0.768238757243592 key: test_jcc value: [0.5 0.5 0.46875 0.45454545 0.44117647 0.46875 0.5 0.57142857 0.45454545 0.47058824] mean value: 0.4829784186401833 key: train_jcc value: [0.47635135 0.48965517 0.48630137 0.48972603 0.51211073 0.50344828 0.49828179 0.48639456 0.50680272 0.50171821] mean value: 0.4950790202442651 MCC on Blind test: 0.52 Accuracy on Blind test: 0.82 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random 
Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.0364356 0.02421284 0.03152752 0.05748749 0.02250338 0.02640176 0.023844 0.02884269 0.02368212 0.03187656] mean value: 0.030681395530700685 key: score_time value: [0.01122618 0.01779962 0.03136301 0.0173564 0.01226211 0.01223731 0.01209617 0.01216483 0.01215291 0.01216865] mean value: 0.015082716941833496 key: test_mcc value: [0.29343545 0.59651016 0.60507041 0.55878788 0.46252711 0.60188445 0.42967947 0.64465837 0.68358472 0.29395676] mean value: 0.5170094779682164 key: train_mcc value: [0.30971175 0.71001953 0.72215841 0.75055279 0.72811768 0.6782196 0.43138877 0.66381467 0.71322062 0.45778342] mean value: 0.6164987243986632 key: test_accuracy value: [0.76923077 0.82417582 0.84615385 0.82417582 0.78021978 0.82222222 0.8 0.86666667 0.87777778 0.76666667] mean value: 0.8177289377289377 key: train_accuracy value: [0.76535627 0.86486486 0.89189189 0.8992629 0.88943489 0.86134969 0.79509202 0.87116564 0.87361963 0.80368098] mean value: 0.8515718786270934 key: test_fscore value: [0.27586207 0.71428571 0.70833333 0.68 0.61538462 0.71428571 0.4375 0.72727273 0.76595745 0.32258065] mean value: 0.5961462265497423 key: train_fscore value: [0.24505929 0.78764479 0.79534884 0.81938326 0.80349345 0.76891616 0.39272727 0.74820144 0.79275654 0.44055944] mean value: 0.6594090469875463 key: test_precision value: [0.8 0.64516129 0.73913043 0.68 0.59259259 0.625 0.875 0.8 0.7826087 0.71428571] mean value: 0.725377872763567 key: train_precision value: [0.93939394 0.68227425 0.81042654 0.79148936 0.76987448 0.69888476 0.98181818 0.79187817 
0.71119134 0.95454545] mean value: 0.8131776468916367 key: test_recall value: [0.16666667 0.8 0.68 0.68 0.64 0.83333333 0.29166667 0.66666667 0.75 0.20833333] mean value: 0.5716666666666667 key: train_recall value: [0.14090909 0.93150685 0.78082192 0.84931507 0.84018265 0.85454545 0.24545455 0.70909091 0.89545455 0.28636364] mean value: 0.6533644665836447 key: test_roc_auc value: [0.57587065 0.81666667 0.79454545 0.77939394 0.73666667 0.82575758 0.63825758 0.8030303 0.83712121 0.58901515] mean value: 0.7396325192220715 key: train_roc_auc value: [0.56877104 0.88592149 0.85679751 0.88348106 0.87387284 0.8592055 0.62188694 0.82009167 0.88050038 0.64066081] mean value: 0.7891189251402788 key: test_jcc value: [0.16 0.55555556 0.5483871 0.51515152 0.44444444 0.55555556 0.28 0.57142857 0.62068966 0.19230769] mean value: 0.44435200863899416 key: train_jcc value: [0.13963964 0.64968153 0.66023166 0.69402985 0.67153285 0.62458472 0.24434389 0.59770115 0.65666667 0.28251121] mean value: 0.5220923161860291 MCC on Blind test: 0.74 Accuracy on Blind test: 0.89 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.03881431 0.03019404 0.03379011 0.03127623 0.03301024 0.02849579 0.0292964 0.03955412 0.06099486 0.02386975] mean value: 0.03492958545684814 key: score_time value: [0.01225305 0.0122292 0.01222324 0.01225877 0.01216602 0.01220107 0.02749681 0.01217771 0.01320577 0.01216698] mean value: 0.013837862014770507 key: test_mcc value: [0.60569466 0.48174234 0.56049517 0.41303145 0.41497432 0.53935989 0.50376657 0.5154942 0.64241792 0.4268753 ] mean value: 0.5103851815446988 key: train_mcc value: [0.63272729 0.65377037 0.64911594 0.4761944 0.57051517 0.56498372 0.69859831 0.47213543 0.66675209 0.67028614] mean value: 0.6055078840398903 key: test_accuracy value: [0.8021978 0.73626374 0.83516484 0.61538462 0.79120879 0.83333333 0.82222222 0.68888889 0.86666667 0.78888889] mean value: 0.778021978021978 key: train_accuracy value: [0.8022113 0.81941032 0.86855037 0.66830467 0.84152334 0.8392638 0.88588957 0.66134969 0.87484663 0.87607362] mean value: 0.8137423312883436 key: test_fscore value: [0.70967742 0.63636364 0.59459459 0.57831325 0.48648649 0.61538462 0.57894737 0.63157895 0.68421053 0.55813953] mean value: 0.6073696382185204 key: train_fscore value: [0.72572402 0.74165202 0.69859155 0.61206897 0.60790274 0.60182371 0.75067024 0.60906516 0.73298429 0.72922252] mean value: 0.6809705210509759 key: test_precision value: [0.57894737 0.51219512 0.91666667 0.4137931 0.75 0.8 0.78571429 0.46153846 0.92857143 0.63157895] mean value: 0.6779005383679811 key: train_precision value: [0.58038147 0.60285714 0.91176471 0.44654088 0.90909091 0.90825688 0.91503268 
0.44238683 0.86419753 0.88888889] mean value: 0.7469397921224509 key: test_recall value: [0.91666667 0.84 0.44 0.96 0.36 0.5 0.45833333 1. 0.54166667 0.5 ] mean value: 0.6516666666666666 key: train_recall value: [0.96818182 0.96347032 0.56621005 0.97260274 0.456621 0.45 0.63636364 0.97727273 0.63636364 0.61818182] mean value: 0.7245267745952677 key: test_roc_auc value: [0.83893035 0.76848485 0.71242424 0.72242424 0.65727273 0.72727273 0.70643939 0.78787879 0.76325758 0.6969697 ] mean value: 0.7381354590682949 key: train_roc_auc value: [0.85446128 0.86492844 0.77302099 0.76445263 0.71990714 0.71659664 0.80725745 0.76090527 0.79969442 0.79480519] mean value: 0.7856029453430742 key: test_jcc value: [0.55 0.46666667 0.42307692 0.40677966 0.32142857 0.44444444 0.40740741 0.46153846 0.52 0.38709677] mean value: 0.4388438909772972 key: train_jcc value: [0.56951872 0.58938547 0.53679654 0.44099379 0.43668122 0.43043478 0.60085837 0.43788187 0.5785124 0.57383966] mean value: 0.5194902824337679 MCC on Blind test: 0.62 Accuracy on Blind test: 0.86 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.27724671 0.26051331 0.26959395 0.26172805 0.26124239 0.25981188 0.26019287 0.25977993 0.25926805 0.26434898] mean value: 0.26337261199951173 key: score_time value: [0.01599622 0.01610565 0.01713943 0.0160625 0.01607919 0.01607633 0.01584888 0.01627922 0.01608992 0.01617551] mean value: 0.016185283660888672 key: test_mcc value: [0.78071437 0.65648795 0.74898796 0.61393939 0.72424242 0.72435769 0.77979322 0.73450514 0.61782299 0.69186077] mean value: 0.7072711911836023 key: train_mcc value: [0.877768 0.90397549 0.90658396 0.91638126 0.90285852 0.89233213 0.90011331 0.87625403 0.86536718 0.88170359] mean value: 0.8923337459745163 key: test_accuracy value: [0.91208791 0.85714286 0.9010989 0.84615385 0.89010989 0.88888889 0.91111111 0.88888889 0.85555556 0.87777778] mean value: 0.8828815628815628 key: train_accuracy value: [0.95208845 0.96191646 0.96314496 0.96683047 0.96068796 0.95705521 0.9607362 0.95092025 0.94601227 0.95337423] mean value: 0.9572766464177507 key: test_fscore value: [0.84 0.75471698 0.81632653 0.72 0.8 0.8 0.84 0.80769231 0.71111111 0.7755102 ] mean value: 0.7865357134629373 key: train_fscore value: [0.91034483 0.93002257 0.93181818 0.93905192 0.92920354 0.92170022 0.92694064 0.90990991 0.90222222 0.91363636] mean value: 0.9214850400078269 key: test_precision value: [0.80769231 0.71428571 0.83333333 0.72 0.8 0.76923077 0.80769231 0.75 0.76190476 0.76 ] mean value: 0.7724139194139195 key: train_precision value: [0.92093023 0.91964286 0.92760181 0.92857143 0.90128755 0.90748899 0.93119266 0.90178571 0.8826087 0.91363636] mean value: 
0.9134746302784097 key: test_recall value: [0.875 0.8 0.8 0.72 0.8 0.83333333 0.875 0.875 0.66666667 0.79166667] mean value: 0.8036666666666666 key: train_recall value: [0.9 0.94063927 0.93607306 0.94977169 0.95890411 0.93636364 0.92272727 0.91818182 0.92272727 0.91363636] mean value: 0.9299024491490245 key: test_roc_auc value: [0.90018657 0.83939394 0.86969697 0.8069697 0.86212121 0.87121212 0.89962121 0.8844697 0.79545455 0.85037879] mean value: 0.8579504748982361 key: train_roc_auc value: [0.93569024 0.95519358 0.95459115 0.96144047 0.96012432 0.95053476 0.94875859 0.94060351 0.93867456 0.9408518 ] mean value: 0.9486462985637039 key: test_jcc value: [0.72413793 0.60606061 0.68965517 0.5625 0.66666667 0.66666667 0.72413793 0.67741935 0.55172414 0.63333333] mean value: 0.6502301799979775 key: train_jcc value: [0.83544304 0.86919831 0.87234043 0.88510638 0.8677686 0.85477178 0.86382979 0.83471074 0.82186235 0.84100418] mean value: 0.8546035601309547 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. 
warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.1903882 0.21745563 0.21988535 0.23056459 0.21335316 0.09320831 0.09771919 0.22602701 0.21265531 0.1860373 ] mean value: 0.1887294054031372 key: score_time value: [0.02607679 0.03801036 0.0424521 0.04715228 0.01995468 0.03384686 0.0404861 0.02912593 0.0286665 0.01805949] mean value: 0.03238310813903809 key: test_mcc value: [0.80485509 0.53716427 0.78588153 0.78588153 0.72424242 0.68023136 0.76553182 0.72435769 0.76634133 0.74119017] mean value: 0.7315677216232281 key: train_mcc value: [0.97503169 0.99063281 0.99063281 0.9843585 0.98122438 0.97190427 0.98441714 0.99377387 0.97817158 0.99377387] mean value: 0.984392091616367 key: test_accuracy value: [0.92307692 0.81318681 0.91208791 0.91208791 0.89010989 0.86666667 0.91111111 0.88888889 0.91111111 0.9 ] mean value: 0.8928327228327229 key: train_accuracy value: [0.99017199 0.9963145 0.9963145 0.99385749 0.99262899 0.98895706 0.99386503 0.99754601 0.99141104 0.99754601] mean value: 0.9938612622661702 key: test_fscore value: [0.85714286 0.66666667 0.84615385 0.84615385 0.8 0.76923077 0.81818182 0.8 0.80952381 0.80851064] mean value: 0.8021564251351486 key: train_fscore value: [0.98173516 0.99310345 0.99310345 0.98850575 0.98623853 0.97940503 0.98861048 0.99545455 0.98390805 0.99545455] mean value: 0.9885518985176559 key: test_precision value: [0.84 0.65384615 0.81481481 0.81481481 0.8 0.71428571 0.9 0.76923077 0.94444444 0.82608696] mean value: 0.8077523667958451 key: train_precision value: [0.98623853 1. 1. 
0.99537037 0.99078341 0.98617512 0.99086758 0.99545455 0.99534884 0.99545455] mean value: 0.9935692935853153 key: test_recall value: [0.875 0.68 0.88 0.88 0.8 0.83333333 0.75 0.83333333 0.70833333 0.79166667] mean value: 0.8031666666666667 key: train_recall value: [0.97727273 0.98630137 0.98630137 0.98173516 0.98173516 0.97272727 0.98636364 0.99545455 0.97272727 0.99545455] mean value: 0.983607305936073 key: test_roc_auc value: [0.90764925 0.77181818 0.90212121 0.90212121 0.86212121 0.85606061 0.85984848 0.87121212 0.84659091 0.8655303 ] mean value: 0.8645073496155585 key: train_roc_auc value: [0.98611111 0.99315068 0.99315068 0.99002724 0.98918691 0.98384263 0.99150115 0.99688694 0.9855233 0.99688694] mean value: 0.9906267579676121 key: test_jcc value: [0.75 0.5 0.73333333 0.73333333 0.66666667 0.625 0.69230769 0.66666667 0.68 0.67857143] mean value: 0.6725879120879121 key: train_jcc value: [0.96412556 0.98630137 0.98630137 0.97727273 0.97285068 0.95964126 0.97747748 0.99095023 0.96832579 0.99095023] mean value: 0.9774196683696653 MCC on Blind test: 0.73 Accuracy on Blind test: 0.89 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, 
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.41849947 0.58950043 0.76989412 0.91573668 0.42887688 0.46664572 0.4057734 0.81402564 0.99178576 0.49223542] mean value: 0.6292973518371582 key: score_time value: [0.09931445 0.02061701 0.05184293 0.03578353 0.02096629 0.0361588 0.03552747 0.05565429 0.020648 0.04811263] mean value: 0.04246253967285156 key: test_mcc value: [0.53958171 0.24362069 0.31156172 0.28426762 0.41497432 0.31195619 0.35478744 0.5063517 0.42640143 0.32401531] mean value: 0.3717518137838298 key: train_mcc value: [0.88453181 0.87827 0.88766131 0.88416629 0.89358226 0.89428983 0.88144713 0.88144713 0.86891849 0.87797787] mean value: 0.8832292128981972 key: test_accuracy value: [0.83516484 0.74725275 0.75824176 0.75824176 0.79120879 0.76666667 0.77777778 0.82222222 0.8 0.76666667] mean value: 0.7823443223443224 key: train_accuracy value: [0.95454545 0.95208845 0.95577396 0.95454545 0.95823096 0.95828221 0.95337423 0.95337423 0.94846626 0.95214724] mean value: 0.9540828446963416 key: test_fscore value: [0.59459459 0.3030303 0.42105263 0.3125 0.48648649 0.4 0.44444444 0.52941176 0.47058824 0.43243243] mean value: 0.43945408925672086 key: train_fscore value: [0.90864198 0.90225564 0.91044776 0.90818859 0.91625616 0.91625616 0.90594059 0.90594059 0.895 0.9037037 ] mean value: 0.9072631168301808 key: test_precision value: [0.84615385 0.625 0.61538462 0.71428571 0.75 0.63636364 0.66666667 0.9 0.8 0.61538462] mean value: 0.7169239094239095 key: train_precision value: [0.99459459 1. 1. 0.99456522 0.99465241 1. 
0.99456522 0.99456522 0.99444444 0.98918919] mean value: 0.9956576286819253 key: test_recall value: [0.45833333 0.2 0.32 0.2 0.36 0.29166667 0.33333333 0.375 0.33333333 0.33333333] mean value: 0.3205 key: train_recall value: [0.83636364 0.82191781 0.83561644 0.83561644 0.84931507 0.84545455 0.83181818 0.83181818 0.81363636 0.83181818] mean value: 0.8333374844333749 key: test_roc_auc value: [0.71424129 0.57727273 0.62212121 0.58484848 0.65727273 0.6155303 0.63636364 0.67992424 0.65151515 0.62878788] mean value: 0.6367877657168702 key: train_roc_auc value: [0.91734007 0.9109589 0.91780822 0.91696788 0.9238172 0.92272727 0.91506875 0.91506875 0.90597785 0.91422842] mean value: 0.9159963318383947 key: test_jcc value: [0.42307692 0.17857143 0.26666667 0.18518519 0.32142857 0.25 0.28571429 0.36 0.30769231 0.27586207] mean value: 0.28541974373008855 key: train_jcc value: [0.83257919 0.82191781 0.83561644 0.83181818 0.84545455 0.84545455 0.8280543 0.8280543 0.80995475 0.82432432] mean value: 0.8303228377563591 MCC on Blind test: 0.3 Accuracy on Blind test: 0.77 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [1.28913355 1.31570816 1.21427155 1.3062737 1.31983781 1.22301054 1.25804472 1.31995225 1.30364299 1.22600722] mean value: 1.2775882482528687 key: score_time value: [0.010149 0.0146482 0.01436591 0.01001978 0.01055002 0.01526237 0.01032352 0.00997353 0.01475406 0.00955248] mean value: 0.01195988655090332 key: test_mcc value: [0.83591639 0.7098276 0.78588153 0.79423548 0.80485509 0.78877892 0.79879562 0.81147376 0.76784594 0.81147376] mean value: 0.7909084087989999 key: train_mcc value: [0.99067471 0.98750624 0.98748946 0.9843585 0.98133231 0.98754775 1. 0.99067895 0.98132162 0.99377387] mean value: 0.9884683404995436 key: test_accuracy value: [0.93406593 0.87912088 0.91208791 0.91208791 0.92307692 0.91111111 0.92222222 0.92222222 0.91111111 0.92222222] mean value: 0.914932844932845 key: train_accuracy value: [0.9963145 0.995086 0.995086 0.99385749 0.99262899 0.99509202 1. 0.99631902 0.99263804 0.99754601] mean value: 0.9954568064997513 key: test_fscore value: [0.88 0.79245283 0.84615385 0.85185185 0.85714286 0.84615385 0.85106383 0.8627451 0.82608696 0.8627451 ] mean value: 0.8476396213878485 key: train_fscore value: [0.99319728 0.99086758 0.99082569 0.98850575 0.98636364 0.99090909 1. 0.99319728 0.98636364 0.99545455] mean value: 0.9915684482022545 key: test_precision value: [0.84615385 0.75 0.81481481 0.79310345 0.875 0.78571429 0.86956522 0.81481481 0.86363636 0.81481481] mean value: 0.8227617605616107 key: train_precision value: [0.99095023 0.99086758 0.99539171 0.99537037 0.98190045 0.99090909 1. 
0.99095023 0.98636364 0.99545455] mean value: 0.9918157833052819 key: test_recall value: [0.91666667 0.84 0.88 0.92 0.84 0.91666667 0.83333333 0.91666667 0.79166667 0.91666667] mean value: 0.8771666666666667 key: train_recall value: [0.99545455 0.99086758 0.98630137 0.98173516 0.99086758 0.99090909 1. 0.99545455 0.98636364 0.99545455] mean value: 0.9913408053134081 key: test_roc_auc value: [0.92848259 0.8669697 0.90212121 0.91454545 0.89727273 0.91287879 0.89393939 0.92045455 0.87310606 0.92045455] mean value: 0.90302250113071 key: train_roc_auc value: [0.99604377 0.99375312 0.99231035 0.99002724 0.99207245 0.99377387 1. 0.9960466 0.99066081 0.99688694] mean value: 0.9941575146732279 key: test_jcc value: [0.78571429 0.65625 0.73333333 0.74193548 0.75 0.73333333 0.74074074 0.75862069 0.7037037 0.75862069] mean value: 0.736225226000671 key: train_jcc value: [0.98648649 0.98190045 0.98181818 0.97727273 0.97309417 0.98198198 1. 0.98648649 0.97309417 0.99095023] mean value: 0.9833084883586071 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.14145827 0.06715274 0.04703712 0.0477314 0.07376885 0.05179882 0.08239913 0.05850697 0.14945626 0.13114095] mean value: 0.08504505157470703 key: score_time value: [0.013623 0.01364827 0.01348495 0.01350093 0.0139091 0.01341081 0.01348758 0.01463079 0.01364398 0.0333693 ] mean value: 0.01567087173461914 key: test_mcc value: [ 0.13146936 0.08884904 -0.20842878 0.21624971 -0.10050378 -0.15853511 0.14830704 0.06043672 -0.1571382 0.16261091] mean value: 0.018331691626637253 key: train_mcc value: [0.20644118 0.25718751 0.20851971 0.18754191 0.19327336 0.20749178 0.1992175 0.19500127 0.20061074 0.18198662] mean value: 0.2037271576439099 key: test_accuracy value: [0.38461538 0.40659341 0.26373626 0.38461538 0.30769231 0.27777778 0.36666667 0.31111111 0.28888889 0.37777778] mean value: 0.3369474969474969 key: train_accuracy value: [0.37346437 0.42137592 0.37469287 0.35626536 0.36117936 0.37423313 0.36687117 0.36319018 0.36809816 0.35214724] mean value: 0.37115177642785 key: test_fscore value: [0.44 0.4375 0.37383178 0.47169811 0.38834951 0.36893204 0.44660194 0.42592593 0.36 0.45098039] mean value: 0.41638197021369017 key: train_fscore value: [0.46315789 0.48184818 0.4625132 0.45530146 0.45720251 0.46315789 0.46025105 0.45881126 0.46073298 0.45454545] mean value: 0.4617521880985164 key: test_precision value: [0.28947368 0.29577465 0.24390244 0.30864198 0.25641026 0.24050633 0.29113924 0.27380952 0.23684211 0.29487179] mean value: 0.27313719964058686 key: train_precision value: [0.30136986 0.3173913 0.30082418 0.29475101 0.29634641 0.30136986 0.29891304 
0.29769959 0.29931973 0.29411765] mean value: 0.3002102642167985 key: test_recall value: [0.91666667 0.84 0.8 1. 0.8 0.79166667 0.95833333 0.95833333 0.75 0.95833333] mean value: 0.8773333333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.55534826 0.54121212 0.43030303 0.57575758 0.46060606 0.44128788 0.55492424 0.51704545 0.43560606 0.5625 ] mean value: 0.5074590682948892 key: train_roc_auc value: [0.57070707 0.60420168 0.57226891 0.55966387 0.56302521 0.57142857 0.56638655 0.56386555 0.56722689 0.55630252] mean value: 0.5695076818606231 key: test_jcc value: [0.28205128 0.28 0.22988506 0.30864198 0.24096386 0.22619048 0.2875 0.27058824 0.2195122 0.29113924] mean value: 0.26364723173657495 key: train_jcc value: [0.30136986 0.3173913 0.30082418 0.29475101 0.29634641 0.30136986 0.29891304 0.29769959 0.29931973 0.29411765] mean value: 0.3002102642167985 MCC on Blind test: 0.1 Accuracy on Blind test: 0.36 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.0502224 0.06870103 0.07678366 0.05153155 0.07788491 0.06945562 0.07719898 0.06175971 0.06251836 0.01799321] mean value: 0.06140494346618652 key: score_time value: [0.03146505 0.02858329 0.03003502 0.03028297 0.02979541 0.02962589 0.03054166 0.03654528 0.01298213 0.02162504] mean value: 0.028148174285888672 key: test_mcc value: [0.66044776 0.58138656 0.69312083 0.53181644 0.42817442 0.57966713 0.45226702 0.64465837 0.73663511 0.48863636] mean value: 0.579680999444377 key: train_mcc value: [0.71598517 0.71445751 0.70526551 0.71583656 0.72435597 0.69503973 0.71752285 0.6986792 0.69083097 0.70780065] mean value: 0.7085774124854681 key: test_accuracy value: [0.86813187 0.83516484 0.87912088 0.82417582 0.76923077 0.83333333 0.8 0.86666667 0.9 0.8 ] mean value: 0.8375824175824176 key: train_accuracy value: [0.88943489 0.88943489 0.88574939 0.88943489 0.89189189 0.88220859 0.88957055 0.88220859 0.8797546 0.88711656] mean value: 0.8866804841651468 key: test_fscore value: [0.75 0.69387755 0.7755102 0.63636364 0.58823529 0.69387755 0.57142857 0.72727273 0.8 0.625 ] mean value: 0.6861565535305031 key: train_fscore value: [0.79069767 0.78873239 0.78220141 0.79069767 0.79816514 0.77358491 0.79262673 0.77880184 0.77209302 0.78301887] mean value: 0.7850619654239601 key: test_precision value: [0.75 0.70833333 0.79166667 0.73684211 0.57692308 0.68 0.66666667 0.8 0.85714286 0.625 ] mean value: 0.7192574705995759 key: train_precision value: [0.80952381 0.8115942 0.80288462 0.8056872 0.80184332 0.80392157 0.80373832 0.78971963 0.79047619 0.81372549] mean value: 
0.8033114342795749 key: test_recall value: [0.75 0.68 0.76 0.56 0.6 0.70833333 0.5 0.66666667 0.75 0.625 ] mean value: 0.66 key: train_recall value: [0.77272727 0.76712329 0.76255708 0.77625571 0.79452055 0.74545455 0.78181818 0.76818182 0.75454545 0.75454545] mean value: 0.7677729348277293 key: test_roc_auc value: [0.83022388 0.7869697 0.84212121 0.74212121 0.71666667 0.79356061 0.70454545 0.8030303 0.85227273 0.74431818] mean value: 0.7815829941203075 key: train_roc_auc value: [0.8526936 0.85078853 0.84682476 0.85367407 0.86112582 0.83911383 0.85561497 0.84627578 0.84029794 0.84533995] mean value: 0.8491749262317353 key: test_jcc value: [0.6 0.53125 0.63333333 0.46666667 0.41666667 0.53125 0.4 0.57142857 0.66666667 0.45454545] mean value: 0.5271807359307359 key: train_jcc value: [0.65384615 0.65116279 0.64230769 0.65384615 0.66412214 0.63076923 0.65648855 0.63773585 0.62878788 0.64341085] mean value: 0.6462477289047467 MCC on Blind test: 0.59 Accuracy on Blind test: 0.85 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.44820738 0.80184031 0.63607836 0.53813601 0.51265216 0.4790895 0.677001 0.53256416 0.59983206 0.68059635] mean value: 0.5905997276306152 key: score_time value: [0.04019618 0.02686477 0.02605987 0.03050518 0.0192945 0.04190254 0.01563525 0.03461552 0.02633142 0.03640819] mean value: 0.029781341552734375 key: test_mcc value: [0.66044776 0.52551942 0.70064905 0.53181644 0.42817442 0.57966713 0.4792982 0.64465837 0.73663511 0.48863636] mean value: 0.5775502262717324 key: train_mcc value: [0.71598517 0.72307926 0.72431999 0.71583656 0.72435597 0.69503973 0.72452859 0.6986792 0.70276203 0.70780065] mean value: 0.7132387140179451 key: test_accuracy value: [0.86813187 0.81318681 0.87912088 0.82417582 0.76923077 0.83333333 0.81111111 0.86666667 0.9 0.8 ] mean value: 0.8364957264957265 key: train_accuracy value: [0.88943489 0.89312039 0.89312039 0.88943489 0.89189189 0.88220859 0.89202454 0.88220859 0.88466258 0.88711656] mean value: 0.8885223315898163 key: test_fscore value: [0.75 0.65306122 0.78431373 0.63636364 0.58823529 0.69387755 0.58536585 0.72727273 0.8 0.625 ] mean value: 0.6843490012412947 key: train_fscore value: [0.79069767 0.79432624 0.79625293 0.79069767 0.79816514 0.77358491 0.79816514 0.77880184 0.78037383 0.78301887] mean value: 0.7884084241280366 key: test_precision value: [0.75 0.66666667 0.76923077 0.73684211 0.57692308 0.68 0.70588235 0.8 0.85714286 0.625 ] mean value: 0.7167687828167705 key: train_precision value: [0.80952381 0.82352941 0.81730769 0.8056872 0.80184332 0.80392157 0.80555556 0.78971963 0.80288462 0.81372549] mean value: 0.8073698291291952 key: test_recall value: [0.75 0.64 0.8 0.56 0.6 0.70833333 0.5 0.66666667 0.75 0.625 ] mean value: 0.66 key: train_recall value: [0.77272727 0.76712329 0.77625571 0.77625571 0.79452055 0.74545455 0.79090909 0.76818182 0.75909091 0.75454545] mean value: 0.7705064342050643 key: test_roc_auc value: [0.83022388 0.75939394 0.85454545 0.74212121 0.71666667 0.79356061 0.71212121 0.8030303 0.85227273 0.74431818] mean value: 0.7808254183627318 key: train_roc_auc value: [0.8526936 0.85330954 0.85619508 0.85367407 0.86112582 0.83911383 0.86016043 0.84627578 0.84509167 0.84533995] mean value: 0.8512979784414112 key: test_jcc value: [0.6 0.48484848 0.64516129 0.46666667 0.41666667 0.53125 0.4137931 0.57142857 0.66666667 0.45454545] mean value: 0.5251026904593368 key: train_jcc value: [0.65384615 0.65882353 0.6614786 0.65384615 0.66412214 0.63076923 0.66412214 0.63773585 0.63984674 0.64341085] mean value: 0.6508001386969055 MCC on Blind test: 0.59 Accuracy on Blind test: 0.85
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:115: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:118: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result(
Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB',
GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), 
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.08875203 0.17942286 0.20372772 0.21905112 0.11756659 0.10055804 0.20381188 0.14366698 0.14148211 0.22374272] mean value: 0.1621782064437866 key: score_time value: [0.02979755 0.01557469 0.02248502 0.03827453 0.02860117 0.0163877 0.01603985 0.01669645 0.0194993 0.02031589] mean value: 0.022367215156555174 key: test_mcc value: [0.77892275 0.73050958 0.74942473 0.82425939 0.71319972 0.7800135 0.85201287 0.78932976 0.86452993 0.74456392] mean value: 0.7826766159113144 key: train_mcc value: [0.81677727 0.82356215 0.81538413 0.80551688 0.8115084 0.81720389 0.8018788 0.81804388 0.81099596 0.81850391] mean value: 0.8139375262794872 key: test_accuracy value: [0.88721805 0.86466165 0.87121212 0.90909091 0.84848485 0.88636364 0.92424242 0.89393939 0.93181818 0.87121212] mean value: 0.8888243335611756 key: train_accuracy value: [0.90664424 0.91000841 0.90588235 0.90084034 0.90420168 0.90672269 0.89915966 0.90672269 0.90336134 0.90756303] mean value: 0.9051106430797718 key: test_fscore value: [0.89208633 0.86956522 0.87943262 0.91428571 0.8630137 0.89361702 0.92753623 0.89705882 0.93333333 0.87591241] mean value: 0.8945841404138405 key: train_fscore value: [0.91084337 0.91391794 0.91011236 0.90544872 0.90821256 0.91098637 0.90369181 0.91141261 0.90807354 0.91157556] mean value: 0.9094274846536654 key: test_precision value: [0.84931507 0.84507042 0.82666667 0.86486486 0.7875 
0.84 0.88888889 0.87142857 0.91304348 0.84507042] mean value: 0.8531848383673435 key: train_precision value: [0.87230769 0.8751926 0.87096774 0.86523737 0.87171561 0.87116564 0.86482335 0.86778116 0.86585366 0.87365177] mean value: 0.8698696593137184 key: test_recall value: [0.93939394 0.89552239 0.93939394 0.96969697 0.95454545 0.95454545 0.96969697 0.92424242 0.95454545 0.90909091] mean value: 0.9410673903211217 key: train_recall value: [0.95294118 0.95622896 0.95294118 0.94957983 0.94789916 0.95462185 0.94621849 0.95966387 0.95462185 0.95294118] mean value: 0.9527657527657527 key: test_roc_auc value: [0.88760742 0.86442786 0.87121212 0.90909091 0.84848485 0.88636364 0.92424242 0.89393939 0.93181818 0.87121212] mean value: 0.8888398914518317 key: train_roc_auc value: [0.90660527 0.91004725 0.90588235 0.90084034 0.90420168 0.90672269 0.89915966 0.90672269 0.90336134 0.90756303] mean value: 0.9051106301106302 key: test_jcc value: [0.80519481 0.76923077 0.78481013 0.84210526 0.75903614 0.80769231 0.86486486 0.81333333 0.875 0.77922078] mean value: 0.8100488393855346 key: train_jcc value: [0.83628319 0.84148148 0.83505155 0.8272328 0.83185841 0.8365243 0.82430454 0.8372434 0.83162518 0.83751846] mean value: 0.8339123305107486 MCC on Blind test: 0.72 Accuracy on Blind test: 0.89 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [3.9267838 4.17609477 3.35307336 4.36423445 3.8363719 3.76100397 3.2580955 3.22691202 2.85449815 2.63607287] mean value: 3.539314079284668 key: score_time value: [0.02041101 0.02339602 0.01942444 0.03063869 0.0130496 0.01654649 0.01253414 0.02452779 0.01980877 0.0202508 ] mean value: 0.02005877494812012 key: test_mcc value: [0.79098805 0.75938489 0.76072577 0.82425939 0.77521709 0.79373126 0.8196886 0.78932976 0.84887469 0.74456392] mean value: 0.7906763433813555 key: train_mcc value: [0.83386511 0.84062945 0.83242875 0.8308463 0.82926583 0.82747573 0.82867279 0.83669268 0.8646291 0.83916906] mean value: 0.8363674787603758 key: test_accuracy value: [0.89473684 0.87969925 0.87878788 0.90909091 0.87878788 0.89393939 0.90909091 0.89393939 0.92424242 0.87121212] mean value: 0.8933526999316472 key: train_accuracy value: [0.91589571 0.91925988 0.91512605 0.91428571 0.91344538 0.91260504 0.91344538 
0.91680672 0.93193277 0.91848739] mean value: 0.9171290046716752 key: test_fscore value: [0.89705882 0.88059701 0.88405797 0.91428571 0.89041096 0.9 0.91176471 0.89705882 0.92537313 0.87591241] mean value: 0.8976519555158349 key: train_fscore value: [0.91883117 0.92195122 0.91808597 0.91734198 0.91659919 0.91572123 0.91619203 0.92022562 0.93333333 0.92133009] mean value: 0.9199611829964236 key: test_precision value: [0.87142857 0.88059701 0.84722222 0.86486486 0.8125 0.85135135 0.88571429 0.87142857 0.91176471 0.84507042] mean value: 0.8641942010352804 key: train_precision value: [0.88854003 0.89150943 0.88714734 0.885759 0.884375 0.88419405 0.88801262 0.88390093 0.91451613 0.89028213] mean value: 0.8898236660208628 key: test_recall value: [0.92424242 0.88059701 0.92424242 0.96969697 0.98484848 0.95454545 0.93939394 0.92424242 0.93939394 0.90909091] mean value: 0.9350293984622343 key: train_recall value: [0.9512605 0.95454545 0.9512605 0.9512605 0.9512605 0.94957983 0.94621849 0.95966387 0.95294118 0.95462185] mean value: 0.9522612681436211 key: test_roc_auc value: [0.89495703 0.87969245 0.87878788 0.90909091 0.87878788 0.89393939 0.90909091 0.89393939 0.92424242 0.87121212] mean value: 0.893374038896427 key: train_roc_auc value: [0.91586594 0.91928953 0.91512605 0.91428571 0.91344538 0.91260504 0.91344538 0.91680672 0.93193277 0.91848739] mean value: 0.917128993011346 key: test_jcc value: [0.81333333 0.78666667 0.79220779 0.84210526 0.80246914 0.81818182 0.83783784 0.81333333 0.86111111 0.77922078] mean value: 0.8146467070853036 key: train_jcc value: [0.84984985 0.85520362 0.84857571 0.84730539 0.84603886 0.8445441 0.84534535 0.85223881 0.875 0.85413534] mean value: 0.8518237020427452 MCC on Blind test: 0.7 Accuracy on Blind test: 0.88 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive 
Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 
'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01902628 0.02021194 0.01614714 0.01634669 0.01632619 0.01641297 0.01635528 0.0163734 0.01630759 0.01628184] mean value: 0.016978931427001954 key: score_time value: [0.01291609 0.01158118 0.01146913 0.01135659 0.01135349 0.01156807 0.01147318 0.01137495 0.01159525 0.01156211] mean value: 0.011625003814697266 key: test_mcc value: [0.50384441 0.57928072 0.5611861 0.54645907 0.48574139 0.60633906 0.47018295 0.59152048 0.65521076 0.47412585] mean value: 0.5473890791718706 key: train_mcc value: [0.55295752 0.54784708 0.54518576 0.54641608 0.56334355 0.54310572 0.56322889 0.53467206 0.55645658 0.55167283] mean value: 0.5504886058689347 key: test_accuracy value: [0.7518797 0.78947368 0.78030303 0.77272727 0.74242424 0.8030303 0.73484848 0.79545455 0.82575758 0.73484848] mean value: 0.7730747322852586 key: train_accuracy value: [0.77628259 0.77375946 0.77226891 0.77310924 0.78151261 0.77142857 0.78151261 0.76722689 0.77815126 0.77563025] mean value: 0.7750882388279113 key: test_fscore value: [0.7518797 0.78787879 0.7751938 0.765625 0.734375 0.8 0.72868217 0.8 0.83453237 0.71544715] mean value: 0.7693613984691421 key: train_fscore value: [0.77226027 0.76949443 0.76658053 0.77001704 0.77777778 0.76791809 0.77853492 0.76385337 0.7755102 0.77120823] mean value: 0.7713154861523569 key: test_precision value: [0.74626866 0.8 0.79365079 0.79032258 0.75806452 0.8125 0.74603175 0.7826087 0.79452055 0.77192982] mean value: 
0.7795897361331934 key: train_precision value: [0.78708551 0.78359511 0.78621908 0.7806563 0.79130435 0.77989601 0.78929188 0.77508651 0.7848537 0.78671329] mean value: 0.7844701750183688 key: test_recall value: [0.75757576 0.7761194 0.75757576 0.74242424 0.71212121 0.78787879 0.71212121 0.81818182 0.87878788 0.66666667] mean value: 0.7609452736318408 key: train_recall value: [0.75798319 0.75589226 0.74789916 0.75966387 0.76470588 0.75630252 0.76806723 0.75294118 0.76638655 0.75630252] mean value: 0.7586144356732591 key: test_roc_auc value: [0.75192221 0.78957485 0.78030303 0.77272727 0.74242424 0.8030303 0.73484848 0.79545455 0.82575758 0.73484848] mean value: 0.7730890999547716 key: train_roc_auc value: [0.77629799 0.77374445 0.77226891 0.77310924 0.78151261 0.77142857 0.78151261 0.76722689 0.77815126 0.77563025] mean value: 0.7750882777353365 key: test_jcc value: [0.60240964 0.65 0.63291139 0.62025316 0.58024691 0.66666667 0.57317073 0.66666667 0.71604938 0.55696203] mean value: 0.6265336582169645 key: train_jcc value: [0.62900976 0.62534819 0.62150838 0.62603878 0.63636364 0.6232687 0.63737796 0.61793103 0.63333333 0.62761506] mean value: 0.6277794842107693 MCC on Blind test: 0.51 Accuracy on Blind test: 0.8 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, 
oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01857138 0.01959419 0.01895595 0.01894522 0.01899338 0.01901722 0.01656079 0.01675606 0.01655316 0.02041316] mean value: 0.01843605041503906 key: score_time value: [0.01313376 0.01306129 0.0130434 0.01314616 0.01308227 0.01307869 0.01137018 0.0112958 0.01129675 0.01233172] mean value: 0.012484002113342284 key: test_mcc value: [0.68631183 0.74631701 0.63636364 0.61313934 0.54950626 0.70511024 0.73267501 0.71220297 0.68568568 0.5611861 ] mean value: 0.6628498064589106 key: train_mcc value: [0.67197059 0.68369068 0.67740153 0.66050973 0.68212887 0.67495193 0.67082886 0.65736006 0.68191117 0.66546459] mean value: 0.6726218014876005 key: test_accuracy value: [0.84210526 0.87218045 0.81818182 0.8030303 0.77272727 0.84848485 0.86363636 0.85606061 0.84090909 0.78030303] mean value: 0.8297619047619047 key: train_accuracy value: [0.83515559 0.84020185 0.83781513 0.82857143 0.8394958 0.83613445 0.83445378 0.82689076 0.8394958 0.83109244] mean value: 0.8349307023061537 key: test_fscore value: [0.84671533 0.87769784 0.81818182 0.81690141 0.78571429 0.85915493 0.87142857 0.85714286 0.84892086 0.7751938 ] mean value: 0.8357051702448439 key: train_fscore value: [0.84090909 0.84751204 0.84347121 0.8368 0.8468324 0.84312148 0.84048583 0.83546326 0.84658635 0.83907126] mean value: 0.8420252907043897 key: test_precision value: [0.81690141 0.84722222 0.81818182 0.76315789 0.74324324 0.80263158 0.82432432 0.85074627 0.80821918 0.79365079] mean value: 0.8068278730496224 key: train_precision value: [0.81318681 0.80981595 0.81504702 0.79847328 0.80981595 0.80864198 0.8109375 0.79604262 
0.81076923 0.80122324] mean value: 0.8073953585042138 key: test_recall value: [0.87878788 0.91044776 0.81818182 0.87878788 0.83333333 0.92424242 0.92424242 0.86363636 0.89393939 0.75757576] mean value: 0.8683175033921302 key: train_recall value: [0.87058824 0.88888889 0.87394958 0.8789916 0.88739496 0.88067227 0.87226891 0.8789916 0.88571429 0.88067227] mean value: 0.879813258636788 key: test_roc_auc value: [0.84237901 0.87189055 0.81818182 0.8030303 0.77272727 0.84848485 0.86363636 0.85606061 0.84090909 0.78030303] mean value: 0.8297602894617819 key: train_roc_auc value: [0.83512577 0.84024276 0.83781513 0.82857143 0.8394958 0.83613445 0.83445378 0.82689076 0.8394958 0.83109244] mean value: 0.8349318111082817 key: test_jcc value: [0.73417722 0.78205128 0.69230769 0.69047619 0.64705882 0.75308642 0.7721519 0.75 0.7375 0.63291139] mean value: 0.7191720914446778 key: train_jcc value: [0.7254902 0.73537604 0.72931276 0.71939477 0.73435327 0.72878999 0.72486034 0.71742112 0.73398329 0.72275862] mean value: 0.7271740398801881 MCC on Blind test: 0.59 Accuracy on Blind test: 0.83 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, 
colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01500893 0.01498532 0.01499057 0.01498437 0.01529908 0.01477098 0.01468444 0.01380014 0.01486588 0.01469088] mean value: 0.014808058738708496 key: score_time value: [0.04966807 0.02881622 0.03009772 0.03811646 0.02876663 0.03211856 0.02854252 0.03201628 0.02830029 0.02785945] mean value: 0.03243021965026856 key: test_mcc value: [0.73247531 0.67649057 0.57602211 0.73576721 0.69986771 0.63002408 0.68219104 0.68378319 0.66943868 0.67445327] mean value: 0.6760513177693777 key: train_mcc value: [0.76895612 0.78591503 0.79045033 0.76630612 0.78213811 0.77933874 0.77849745 0.76607208 0.77720446 0.78499714] mean value: 0.7779875592649617 key: test_accuracy value: [0.86466165 0.83458647 0.78787879 0.86363636 0.84848485 0.81060606 0.83333333 0.84090909 0.83333333 0.83333333] mean value: 0.8350763271815903 key: train_accuracy value: [0.88225399 0.89066442 0.89327731 0.88151261 0.88907563 0.88823529 0.88739496 0.88067227 0.88655462 0.8907563 ] mean value: 0.8870397410436 key: test_fscore value: [0.86956522 0.84722222 0.78461538 0.87323944 0.85507246 0.82517483 0.84931507 0.84671533 0.84057971 0.84507042] mean value: 0.8436570079432013 key: train_fscore value: [0.88835726 0.89616613 0.89831865 0.88674699 0.8944 0.89282836 0.89262821 0.88694268 0.89208633 0.89566613] mean value: 0.8924140740905642 key: test_precision value: [0.83333333 0.79220779 0.796875 0.81578947 0.81944444 0.76623377 0.775 0.81690141 0.80555556 0.78947368] mean value: 0.8010814458120333 key: train_precision value: [0.84522003 0.85258359 0.85779817 0.84923077 0.85343511 0.85758514 0.85298622 0.84266263 
0.85060976 0.85714286] mean value: 0.8519254268239733 key: test_recall value: [0.90909091 0.91044776 0.77272727 0.93939394 0.89393939 0.89393939 0.93939394 0.87878788 0.87878788 0.90909091] mean value: 0.8925599276345545 key: train_recall value: [0.93613445 0.94444444 0.94285714 0.92773109 0.9394958 0.93109244 0.93613445 0.93613445 0.93781513 0.93781513] mean value: 0.9369654528478057 key: test_roc_auc value: [0.86499322 0.83401176 0.78787879 0.86363636 0.84848485 0.81060606 0.83333333 0.84090909 0.83333333 0.83333333] mean value: 0.835052012663953 key: train_roc_auc value: [0.88220864 0.89070962 0.89327731 0.88151261 0.88907563 0.88823529 0.88739496 0.88067227 0.88655462 0.8907563 ] mean value: 0.8870397249809014 key: test_jcc value: [0.76923077 0.73493976 0.64556962 0.775 0.74683544 0.70238095 0.73809524 0.73417722 0.725 0.73170732] mean value: 0.7302936314297288 key: train_jcc value: [0.79913917 0.81186686 0.81540698 0.7965368 0.8089725 0.80640466 0.80607815 0.79685265 0.80519481 0.81104651] mean value: 0.8057499073390894 MCC on Blind test: 0.42 Accuracy on Blind test: 0.76 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.09179759 0.09042072 0.09019899 0.09118199 0.08997703 0.08817911 0.09113765 0.08191276 0.09158206 0.09061122] mean value: 0.08969991207122803 key: score_time value: [0.03039932 0.03026342 0.02985764 0.03125 0.03003645 0.0298965 0.03040457 0.02954245 0.03102684 0.0316658 ] mean value: 0.030434298515319824 key: test_mcc value: [0.71848125 0.73440686 0.70214689 0.74420841 0.73960026 0.73960026 0.80123362 0.77711043 0.83806027 0.73267501] mean value: 0.7527523272092836 key: train_mcc value: [0.7802016 0.7983579 0.79206602 0.78188958 0.7932249 0.80187807 0.7875319 0.79470566 0.78980614 0.79509686] mean value: 0.7914758627965253 key: test_accuracy value: [0.85714286 0.86466165 0.84848485 0.86363636 0.86363636 0.86363636 0.89393939 0.88636364 0.91666667 0.86363636] mean value: 0.8721804511278195 key: train_accuracy value: [0.88561817 0.89486964 0.89243697 0.88739496 0.89327731 0.89747899 0.88991597 0.89327731 0.8907563 0.89411765] mean value: 0.8919143267062923 key: test_fscore value: [0.86330935 0.87323944 0.85714286 0.87671233 0.875 0.875 0.90277778 0.89208633 0.92086331 0.87142857] mean value: 0.8807559964541803 key: train_fscore value: [0.89375 0.90196078 0.8992126 0.89448819 0.89976322 0.90378549 0.89709348 0.90039216 0.89811912 0.90063091] mean value: 0.8989195954794376 key: test_precision value: [0.82191781 0.82666667 0.81081081 0.8 0.80769231 0.80769231 0.83333333 0.84931507 0.87671233 0.82432432] mean value: 0.8258464955999203 key: train_precision value: [0.8350365 0.84434655 0.84592593 0.84148148 0.84821429 0.85141159 0.84218289 0.84411765 0.84140969 
0.84843982] mean value: 0.8442566379798555 key: test_recall value: [0.90909091 0.92537313 0.90909091 0.96969697 0.95454545 0.95454545 0.98484848 0.93939394 0.96969697 0.92424242] mean value: 0.9440524649479873 key: train_recall value: [0.96134454 0.96801347 0.95966387 0.95462185 0.95798319 0.96302521 0.95966387 0.96470588 0.96302521 0.95966387] mean value: 0.9611710947005065 key: test_roc_auc value: [0.85753053 0.86420172 0.84848485 0.86363636 0.86363636 0.86363636 0.89393939 0.88636364 0.91666667 0.86363636] mean value: 0.8721732247851651 key: train_roc_auc value: [0.88555442 0.8949311 0.89243697 0.88739496 0.89327731 0.89747899 0.88991597 0.89327731 0.8907563 0.89411765] mean value: 0.8919140989729225 key: test_jcc value: [0.75949367 0.775 0.75 0.7804878 0.77777778 0.77777778 0.82278481 0.80519481 0.85333333 0.7721519 ] mean value: 0.7874001878708579 key: train_jcc value: [0.8079096 0.82142857 0.81688126 0.80911681 0.81779053 0.82446043 0.81339031 0.81883024 0.81507824 0.81922525] mean value: 0.8164111249615581 MCC on Blind test: 0.65 Accuracy on Blind test: 0.85 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [ 7.79554605 11.49424958 7.54219484 5.43912721 6.24815321 11.02255273 10.90592504 5.49640512 4.84995794 5.80157137] mean value: 7.6595683097839355 key: score_time value: [0.01336813 0.02104759 0.02115202 0.01333594 0.01317334 0.01572514 0.0206151 0.01981592 0.01315188 0.01340199] mean value: 0.016478705406188964 key: test_mcc value: [0.79079986 0.76065472 0.72861209 0.85004744 0.77352678 0.73267501 0.78932976 0.7800135 0.81855773 0.73029674] mean value: 0.7754513633073671 key: train_mcc value: [0.93970666 0.97816762 0.97821896 0.97649265 0.98319883 0.97482434 0.97816369 0.89872756 0.95975062 0.97502267] mean value: 0.964227359943727 key: test_accuracy value: [0.89473684 0.87969925 0.86363636 0.92424242 0.88636364 0.86363636 0.89393939 0.88636364 0.90909091 0.86363636] mean value: 0.8865345181134655 key: train_accuracy value: [0.96972246 0.98906644 0.98907563 0.98823529 0.99159664 0.98739496 0.98907563 0.94789916 0.97983193 0.98739496] mean value: 0.9819293099914482 key: test_fscore value: [0.890625 0.88405797 0.859375 0.92647059 0.88888889 0.87142857 0.89705882 0.89361702 0.90769231 0.86956522] mean value: 0.8888779389456867 key: train_fscore value: [0.96938776 0.9891031 0.98901099 0.98827471 0.99161074 
0.9874477 0.98904802 0.94991922 0.97969543 0.98753117] mean value: 0.9821028837722166 key: test_precision value: [0.91935484 0.85915493 0.88709677 0.9 0.86956522 0.82432432 0.87142857 0.84 0.921875 0.83333333] mean value: 0.8726132988958224 key: train_precision value: [0.98106713 0.98497496 0.99489796 0.98497496 0.98994975 0.98333333 0.99155405 0.91446345 0.98637138 0.97697368] mean value: 0.9788560654162172 key: test_recall value: [0.86363636 0.91044776 0.83333333 0.95454545 0.90909091 0.92424242 0.92424242 0.95454545 0.89393939 0.90909091] mean value: 0.9077114427860696 key: train_recall value: [0.95798319 0.99326599 0.98319328 0.99159664 0.99327731 0.99159664 0.98655462 0.98823529 0.97310924 0.99831933] mean value: 0.985713153948448 key: test_roc_auc value: [0.89450475 0.8794663 0.86363636 0.92424242 0.88636364 0.86363636 0.89393939 0.88636364 0.90909091 0.86363636] mean value: 0.8864880144730891 key: train_roc_auc value: [0.96973234 0.98906997 0.98907563 0.98823529 0.99159664 0.98739496 0.98907563 0.94789916 0.97983193 0.98739496] mean value: 0.9819306510482981 key: test_jcc value: [0.8028169 0.79220779 0.75342466 0.8630137 0.8 0.7721519 0.81333333 0.80769231 0.83098592 0.76923077] mean value: 0.8004857274264172 key: train_jcc value: [0.94059406 0.97844113 0.97826087 0.97682119 0.98336106 0.97520661 0.97833333 0.90461538 0.960199 0.97536946] mean value: 0.9651202106233013 MCC on Blind test: 0.63 Accuracy on Blind test: 0.86 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra 
Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.08770657 0.09637952 0.08765864 0.0985477 0.07765341 0.06359506 0.08068061 0.08904219 0.062922 0.09313583] mean value: 0.08373215198516845 key: score_time value: [0.01011205 0.00990009 0.00997949 0.00955868 0.00991607 0.00972962 0.00970531 0.00996614 0.00964093 0.0096693 ] mean value: 0.009817767143249511 key: test_mcc value: [0.81953867 0.70036445 0.86452993 0.81818182 0.84848485 0.80386117 0.84848485 0.80386117 0.89486432 0.77281598] mean value: 0.8174987205049051 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.90977444 0.84962406 0.93181818 0.90909091 0.92424242 0.90151515 0.92424242 0.90151515 0.9469697 0.88636364] mean value: 0.9085156071998177 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90909091 0.85507246 0.93333333 0.90909091 0.92424242 0.9037037 0.92424242 0.9037037 0.94573643 0.88549618] mean value: 0.9093712488490158 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.90909091 0.83098592 0.91304348 0.90909091 0.92424242 0.88405797 0.92424242 0.88405797 0.96825397 0.89230769] mean value: 0.903937366301114 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.90909091 0.88059701 0.95454545 0.90909091 0.92424242 0.92424242 0.92424242 0.92424242 0.92424242 0.87878788] mean value: 0.9153324287652645 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.90976934 0.84938942 0.93181818 0.90909091 0.92424242 0.90151515 0.92424242 0.90151515 0.9469697 0.88636364] mean value: 0.908491632745364 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.83333333 0.74683544 0.875 0.83333333 0.85915493 0.82432432 0.85915493 0.82432432 0.89705882 0.79452055] mean value: 0.8347039988982837 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.71 Accuracy on Blind test: 0.89 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.21911669 0.22973084 0.22298813 0.2171011 0.21578741 0.20552397 0.21279311 0.21100545 0.22396517 0.20665073] mean value: 0.21646625995635987 key: score_time value: [0.02211118 0.02221394 0.0213306 0.02232122 0.02003217 0.02073336 0.02134371 0.02109003 0.02177501 0.02143049] mean value: 0.021438169479370116 key: test_mcc value: [0.8046133 0.74631701 0.72760688 0.85004744 0.83419555 0.74456392 0.82425939 0.85201287 0.90909091 0.77352678] mean value: 0.8066234039020962 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.90225564 0.87218045 0.86363636 0.92424242 0.91666667 0.87121212 0.90909091 0.92424242 0.95454545 0.88636364] mean value: 0.9024436090225564 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90225564 0.87769784 0.86153846 0.92647059 0.91851852 0.87591241 0.91428571 0.92753623 0.95454545 0.88888889] mean value: 0.9047649747479877 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.89552239 0.84722222 0.875 0.9 0.89855072 0.84507042 0.86486486 0.88888889 0.95454545 0.86956522] mean value: 0.8839230183145329 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.90909091 0.91044776 0.84848485 0.95454545 0.93939394 0.90909091 0.96969697 0.96969697 0.95454545 0.90909091] mean value: 0.9274084124830394 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90230665 0.87189055 0.86363636 0.92424242 0.91666667 0.87121212 0.90909091 0.92424242 0.95454545 0.88636364] mean value: 0.9024197195838988 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.82191781 0.78205128 0.75675676 0.8630137 0.84931507 0.77922078 0.84210526 0.86486486 0.91304348 0.8 ] mean value: 0.8272288999654913 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.59 Accuracy on Blind test: 0.84 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01580811 0.01548338 0.01368093 0.01516891 0.0141232 0.01596141 0.01456952 0.01595473 0.01479793 0.01596093] mean value: 0.015150904655456543 key: score_time value: [0.01032925 0.01032257 0.0101037 0.00945139 0.01024938 0.00988722 0.01068926 0.0100286 0.01042461 0.01069736] mean value: 0.010218334197998048 key: test_mcc value: [0.50384441 0.64007417 0.53085171 0.66697297 0.63900965 0.63753558 0.65521076 0.69825325 0.5992912 0.56067042] mean value: 0.6131714128542862 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.7518797 0.81954887 0.76515152 0.83333333 0.81818182 0.81818182 0.82575758 0.84848485 0.79545455 0.78030303] mean value: 0.8056277056277057 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.7518797 0.82608696 0.75968992 0.83076923 0.82608696 0.82352941 0.83453237 0.85294118 0.81118881 0.77862595] mean value: 0.8095330493264747 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.74626866 0.8028169 0.77777778 0.84375 0.79166667 0.8 0.79452055 0.82857143 0.75324675 0.78461538] mean value: 0.7923234116948085 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.75757576 0.85074627 0.74242424 0.81818182 0.86363636 0.84848485 0.87878788 0.87878788 0.87878788 0.77272727] mean value: 0.8290140208050656 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.75192221 0.81931253 0.76515152 0.83333333 0.81818182 0.81818182 0.82575758 0.84848485 0.79545455 0.78030303] mean value: 0.8056083220262324 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.60240964 0.7037037 0.6125 0.71052632 0.7037037 0.7 0.71604938 0.74358974 0.68235294 0.6375 ] mean value: 0.6812335429233362 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.34 Accuracy on Blind test: 0.74 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [3.97470117 3.93355489 6.33849072 7.47062922 6.05037212 5.72808099 4.41282606 5.67960405 6.03741074 6.04518557] mean value: 5.567085552215576 key: score_time value: [0.11290765 0.10394311 0.25654578 0.17098546 0.14105511 0.14017177 0.1155014 0.14093661 0.14221764 0.17594004] mean value: 0.15002045631408692 key: test_mcc value: [0.94028503 0.80667588 0.83573501 0.89486432 0.9251987 0.7431924 0.91076511 0.89404202 0.93939394 0.88040627] mean value: 0.8770558685927923 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.96992481 0.90225564 0.91666667 0.9469697 0.96212121 0.87121212 0.95454545 0.9469697 0.96969697 0.93939394] mean value: 0.9379756208703577 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97014925 0.90647482 0.91338583 0.94814815 0.96296296 0.87407407 0.95588235 0.94736842 0.96969697 0.94117647] mean value: 0.938931930011108 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95588235 0.875 0.95081967 0.92753623 0.94202899 0.85507246 0.92857143 0.94029851 0.96969697 0.91428571] mean value: 0.9259192326248543 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.94029851 0.87878788 0.96969697 0.98484848 0.89393939 0.98484848 0.95454545 0.96969697 0.96969697] mean value: 0.9531207598371778 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.97003618 0.90196744 0.91666667 0.9469697 0.96212121 0.87121212 0.95454545 0.9469697 0.96969697 0.93939394] mean value: 0.9379579375848033 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94202899 0.82894737 0.84057971 0.90140845 0.92857143 0.77631579 0.91549296 0.9 0.94117647 0.88888889] mean value: 0.8863410050046168 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, 
tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. 
warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [2.18256736 3.10948396 3.41758657 2.72969723 2.87032175 2.7995379 2.65826178 3.18488312 2.70599651 3.31159854] mean value: 2.8969934701919557 key: score_time value: [0.23093939 0.23506045 0.21551728 0.20308757 0.21437836 0.17920065 0.23916888 0.19888854 0.24526668 0.1810782 ] mean value: 0.21425859928131102 key: test_mcc value: [0.91145917 0.77448543 0.83419555 0.89486432 0.92690611 0.78932976 0.89486432 0.84848485 0.92434853 0.86452993] mean value: 0.8663467976628745 key: train_mcc value: [0.94135097 0.95144787 0.94820052 0.94483811 0.94492358 0.94971397 0.94648053 0.94304896 0.94635215 0.94812549] mean value: 0.9464482143823427 key: test_accuracy value: [0.95488722 0.88721805 0.91666667 0.9469697 0.96212121 0.89393939 0.9469697 0.92424242 0.96212121 0.93181818] mean value: 0.9326953748006379 key: train_accuracy value: [0.9705635 0.97560976 0.97394958 0.97226891 0.97226891 0.97478992 0.97310924 0.97142857 0.97310924 0.97394958] mean value: 0.9731047204415828 key: test_fscore value: [0.95588235 0.88888889 0.91472868 0.94814815 0.96350365 0.89705882 0.94814815 0.92424242 0.96240602 0.93333333] mean value: 0.9336340466074704 key: train_fscore value: [0.97090607 0.97585346 0.97427386 0.97261411 0.97265949 0.975 0.97342193 0.97171381 0.97333333 0.97423109] mean value: 0.9734007136255515 key: test_precision value: [0.92857143 0.88235294 0.93650794 0.92753623 0.92957746 0.87142857 0.92753623 0.92424242 0.95522388 0.91304348] mean value: 0.9196020589341564 key: train_precision value: [0.96052632 0.96540362 
0.96229508 0.96065574 0.95915033 0.96694215 0.96223317 0.96210873 0.96528926 0.96381579] mean value: 0.9628420181669508 key: test_recall value: [0.98484848 0.89552239 0.89393939 0.96969697 1. 0.92424242 0.96969697 0.92424242 0.96969697 0.95454545] mean value: 0.9486431478968792 key: train_recall value: [0.98151261 0.98653199 0.98655462 0.98487395 0.98655462 0.98319328 0.98487395 0.98151261 0.98151261 0.98487395] mean value: 0.9841994171405937 key: test_roc_auc value: [0.95511081 0.88715513 0.91666667 0.9469697 0.96212121 0.89393939 0.9469697 0.92424242 0.96212121 0.93181818] mean value: 0.9327114427860697 key: train_roc_auc value: [0.97055428 0.97561893 0.97394958 0.97226891 0.97226891 0.97478992 0.97310924 0.97142857 0.97310924 0.97394958] mean value: 0.9731047166341285 key: test_jcc value: [0.91549296 0.8 0.84285714 0.90140845 0.92957746 0.81333333 0.90140845 0.85915493 0.92753623 0.875 ] mean value: 0.8765768961595661 key: train_jcc value: [0.94345719 0.95284553 0.94983819 0.94668821 0.94677419 0.95121951 0.94822006 0.94498382 0.94805195 0.94975689] mean value: 0.9481835537416387 MCC on Blind test: 0.79 Accuracy on Blind test: 0.92 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.03519893 0.04723525 0.04286456 0.04674935 0.04282951 0.04792905 0.0430634 0.04757333 0.04246354 0.04807854] mean value: 0.04439854621887207 key: score_time value: [0.02588463 0.02797842 0.02426362 0.02864552 0.02428579 0.02721524 0.02426314 0.02780771 0.02645445 0.02699757] mean value: 0.026379609107971193 key: test_mcc value: [0.68631183 0.74631701 0.63636364 0.61313934 0.54950626 0.70511024 0.73267501 0.71220297 0.68568568 0.5611861 ] mean value: 0.6628498064589106 key: train_mcc value: [0.67197059 0.68369068 0.67740153 0.66050973 0.68212887 0.67495193 0.67082886 0.65736006 0.68191117 0.66546459] mean value: 0.6726218014876005 key: test_accuracy value: [0.84210526 0.87218045 0.81818182 0.8030303 0.77272727 0.84848485 0.86363636 0.85606061 0.84090909 0.78030303] mean value: 0.8297619047619047 key: train_accuracy value: [0.83515559 0.84020185 0.83781513 0.82857143 0.8394958 0.83613445 0.83445378 0.82689076 0.8394958 0.83109244] mean value: 0.8349307023061537 key: test_fscore value: [0.84671533 0.87769784 0.81818182 0.81690141 0.78571429 0.85915493 0.87142857 0.85714286 0.84892086 0.7751938 ] mean value: 0.8357051702448439 key: train_fscore value: [0.84090909 0.84751204 0.84347121 0.8368 0.8468324 0.84312148 0.84048583 0.83546326 0.84658635 0.83907126] mean value: 0.8420252907043897 key: test_precision value: [0.81690141 0.84722222 0.81818182 0.76315789 0.74324324 0.80263158 0.82432432 0.85074627 0.80821918 0.79365079] mean value: 0.8068278730496224 key: train_precision value: [0.81318681 0.80981595 0.81504702 0.79847328 0.80981595 0.80864198 0.8109375 0.79604262 
0.81076923 0.80122324] mean value: 0.8073953585042138 key: test_recall value: [0.87878788 0.91044776 0.81818182 0.87878788 0.83333333 0.92424242 0.92424242 0.86363636 0.89393939 0.75757576] mean value: 0.8683175033921302 key: train_recall value: [0.87058824 0.88888889 0.87394958 0.8789916 0.88739496 0.88067227 0.87226891 0.8789916 0.88571429 0.88067227] mean value: 0.879813258636788 key: test_roc_auc value: [0.84237901 0.87189055 0.81818182 0.8030303 0.77272727 0.84848485 0.86363636 0.85606061 0.84090909 0.78030303] mean value: 0.8297602894617819 key: train_roc_auc value: [0.83512577 0.84024276 0.83781513 0.82857143 0.8394958 0.83613445 0.83445378 0.82689076 0.8394958 0.83109244] mean value: 0.8349318111082817 key: test_jcc value: [0.73417722 0.78205128 0.69230769 0.69047619 0.64705882 0.75308642 0.7721519 0.75 0.7375 0.63291139] mean value: 0.7191720914446778 key: train_jcc value: [0.7254902 0.73537604 0.72931276 0.71939477 0.73435327 0.72878999 0.72486034 0.71742112 0.73398329 0.72275862] mean value: 0.7271740398801881 MCC on Blind test: 0.59 Accuracy on Blind test: 0.83 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', 
MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 
'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [3.90365171 2.85308623 2.88379312 2.87783575 2.93418789 2.8434577 2.931072 2.88599873 2.94633913 2.89332843] mean value: 2.995275068283081 key: score_time value: [0.01317334 0.01315236 0.01351786 0.013798 0.01476145 0.0138061 0.01398635 0.01382971 0.01496601 0.01291895] mean value: 0.01379101276397705 key: test_mcc value: [0.91145917 0.80667588 0.84848485 0.88040627 0.93982555 0.81855773 0.9701425 0.89486432 0.87919164 0.87919164] mean value: 0.8828799556470258 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95488722 0.90225564 0.92424242 0.93939394 0.96969697 0.90909091 0.98484848 0.9469697 0.93939394 0.93939394] mean value: 0.9410173160173161 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.95588235 0.90647482 0.92424242 0.94117647 0.97014925 0.91044776 0.98507463 0.94814815 0.93846154 0.94029851] mean value: 0.9420355903779138 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.92857143 0.875 0.92424242 0.91428571 0.95588235 0.89705882 0.97058824 0.92753623 0.953125 0.92647059] mean value: 0.9272760798983625 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.94029851 0.92424242 0.96969697 0.98484848 0.92424242 1. 0.96969697 0.92424242 0.95454545] mean value: 0.9576662143826323 key: train_recall value: [1. 1. 1. 1. 1. 
1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.95511081 0.90196744 0.92424242 0.93939394 0.96969697 0.90909091 0.98484848 0.9469697 0.93939394 0.93939394] mean value: 0.941010854816825 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.91549296 0.82894737 0.85915493 0.88888889 0.94202899 0.83561644 0.97058824 0.90140845 0.88405797 0.88732394] mean value: 0.8913508169172104 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.77 Accuracy on Blind test: 0.91 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), 
('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.08217144 0.09075713 0.08338308 0.113235 0.12136722 0.08390951 0.07351661 0.0989933 0.09131002 0.10390306] mean value: 0.09425463676452636 key: score_time value: [0.01993656 0.02662683 0.01330376 0.01766443 0.02078724 0.01329708 0.01319885 0.01317883 0.02058554 0.02135062] mean value: 0.01799297332763672 key: test_mcc value: [0.7515073 0.72930801 0.73267501 0.74942473 0.75725927 0.77711043 0.80386117 0.78824078 0.80534465 0.76072577] mean value: 0.7655457131180822 key: train_mcc value: [0.83607958 0.84513414 0.8445445 0.8344302 0.8395983 0.84233833 0.83064344 0.8395983 0.83064344 0.84095802] mean value: 0.8383968239634668 key: test_accuracy 
value: [0.87218045 0.86466165 0.86363636 0.87121212 0.87121212 0.88636364 0.90151515 0.89393939 0.90151515 0.87878788] mean value: 0.8805023923444976 key: train_accuracy value: [0.91673675 0.92094197 0.9210084 0.91596639 0.91848739 0.92016807 0.91428571 0.91848739 0.91428571 0.91932773] mean value: 0.9179695528337491 key: test_fscore value: [0.87943262 0.86567164 0.87142857 0.87943262 0.88275862 0.89208633 0.9037037 0.89552239 0.90510949 0.88405797] mean value: 0.8859203964900466 key: train_fscore value: [0.91996766 0.92419355 0.92394822 0.91909385 0.92158448 0.92282697 0.91720779 0.92158448 0.91720779 0.92220421] mean value: 0.9209819008738551 key: test_precision value: [0.82666667 0.86567164 0.82432432 0.82666667 0.81012658 0.84931507 0.88405797 0.88235294 0.87323944 0.84722222] mean value: 0.8489643521253238 key: train_precision value: [0.88629283 0.8869969 0.89079563 0.88611544 0.88785047 0.89308176 0.88697017 0.88785047 0.88697017 0.89045383] mean value: 0.8883377690429243 key: test_recall value: [0.93939394 0.86567164 0.92424242 0.93939394 0.96969697 0.93939394 0.92424242 0.90909091 0.93939394 0.92424242] mean value: 0.9274762550881954 key: train_recall value: [0.95630252 0.96464646 0.95966387 0.95462185 0.95798319 0.95462185 0.94957983 0.95798319 0.94957983 0.95630252] mean value: 0.956128512010865 key: test_roc_auc value: [0.87268204 0.864654 0.86363636 0.87121212 0.87121212 0.88636364 0.90151515 0.89393939 0.90151515 0.87878788] mean value: 0.8805517865219358 key: train_roc_auc value: [0.91670345 0.92097869 0.9210084 0.91596639 0.91848739 0.92016807 0.91428571 0.91848739 0.91428571 0.91932773] mean value: 0.9179698950287186 key: test_jcc value: [0.78481013 0.76315789 0.7721519 0.78481013 0.79012346 0.80519481 0.82432432 0.81081081 0.82666667 0.79220779] mean value: 0.7954257902630099 key: train_jcc value: [0.85179641 0.85907046 0.85864662 0.8502994 0.85457271 0.85671192 0.84707646 0.85457271 0.84707646 0.8556391 ] mean value: 0.8535462253796596 MCC on 
Blind test: 0.63 Accuracy on Blind test: 0.85 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge 
Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01841187 0.04239488 0.04338026 0.018224 0.0188179 0.01878238 0.01850367 0.01893735 0.01868582 0.01888943] mean value: 0.023502755165100097 key: score_time value: [0.02268887 0.02586532 0.02344608 0.01426363 0.01349759 0.01352072 0.01398182 0.01327801 0.0132823 0.01402259] mean value: 0.01678469181060791 key: test_mcc value: [0.5188958 0.66938237 0.56222174 0.54570516 0.59097693 0.72760688 0.65765844 0.63753558 0.73029674 0.62128344] mean value: 0.6261563059343562 key: train_mcc value: [0.63347346 0.65058239 0.62701829 0.63876463 0.63370295 0.65287677 0.61513387 0.60001356 0.64604365 0.61848827] mean value: 0.6316097837906691 key: test_accuracy value: [0.7593985 0.83458647 0.78030303 0.77272727 0.79545455 0.86363636 0.82575758 0.81818182 0.86363636 0.81060606] mean value: 0.8124287992709045 key: train_accuracy value: [0.81665265 0.82506308 0.81344538 0.81932773 0.81680672 0.82605042 0.80756303 0.8 0.82268908 0.8092437 ] mean value: 0.815684177792227 key: test_fscore value: [0.75384615 0.83823529 0.77165354 0.7761194 0.79389313 0.86567164 0.83687943 0.8125 0.86956522 0.81203008] mean value: 0.8130393891021387 key: train_fscore value: [0.81893688 0.82809917 
0.81530782 0.82098251 0.81833333 0.83018868 0.80804694 0.80067002 0.82662284 0.80940386] mean value: 0.8176592060673311 key: test_precision value: [0.765625 0.82608696 0.80327869 0.76470588 0.8 0.85294118 0.78666667 0.83870968 0.83333333 0.80597015] mean value: 0.8077317530542945 key: train_precision value: [0.80952381 0.81331169 0.80724876 0.81353135 0.81157025 0.81089744 0.80602007 0.79799666 0.80868167 0.80872483] mean value: 0.8087506531449246 key: test_recall value: [0.74242424 0.85074627 0.74242424 0.78787879 0.78787879 0.87878788 0.89393939 0.78787879 0.90909091 0.81818182] mean value: 0.8199231117141564 key: train_recall value: [0.82857143 0.84343434 0.82352941 0.82857143 0.82521008 0.85042017 0.81008403 0.80336134 0.84537815 0.81008403] mean value: 0.8268644427467957 key: test_roc_auc value: [0.75927182 0.83446404 0.78030303 0.77272727 0.79545455 0.86363636 0.82575758 0.81818182 0.86363636 0.81060606] mean value: 0.8124038896426956 key: train_roc_auc value: [0.81664262 0.82507852 0.81344538 0.81932773 0.81680672 0.82605042 0.80756303 0.8 0.82268908 0.8092437 ] mean value: 0.8156847183317772 key: test_jcc value: [0.60493827 0.72151899 0.62820513 0.63414634 0.65822785 0.76315789 0.7195122 0.68421053 0.76923077 0.6835443 ] mean value: 0.686669226591934 key: train_jcc value: [0.69338959 0.70662906 0.68820225 0.69632768 0.69252468 0.70967742 0.67791842 0.66759777 0.70448179 0.67983075] mean value: 0.6916579410309931 MCC on Blind test: 0.54 Accuracy on Blind test: 0.81 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), 
('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.03066039 0.04237843 0.0722537 0.05127764 0.03083706 0.04251885 0.03168774 0.0351162 0.02943254 0.03488851] mean value: 0.04010510444641113 key: score_time value: [0.01965117 0.01301241 0.01952147 0.01264167 0.01278877 0.01377964 0.01281571 0.01285934 0.0128963 0.012851 ] mean value: 0.014281749725341797 key: test_mcc value: [0.71848125 0.68595814 0.68723073 0.81442137 0.69293487 0.71319972 0.77521709 0.73576721 0.86452993 0.36155076] mean value: 0.7049291083404625 key: train_mcc value: [0.79940017 0.81798627 0.74027929 0.8033942 0.76201646 0.80154842 0.77356593 0.78546966 0.79651203 0.54421959] mean value: 0.7624392036940087 key: test_accuracy value: [0.85714286 0.84210526 0.82575758 0.90151515 0.83333333 0.84848485 0.87878788 0.86363636 0.93181818 0.65151515] mean value: 0.8434096605149237 key: train_accuracy value: [0.89823381 0.9058032 0.85882353 0.89747899 0.87394958 0.89495798 0.87815126 0.88571429 0.89579832 0.73445378] mean value: 0.8723364736979737 key: test_fscore value: [0.86330935 0.84892086 0.8496732 0.90909091 0.85333333 0.8630137 0.89041096 0.87323944 0.93333333 0.52083333] mean value: 0.8405158421686592 key: train_fscore value: [0.90249799 0.91125198 0.8742515 0.90438871 0.88496933 0.90317583 0.88973384 0.89554531 0.90127389 0.64414414] mean value: 0.8711232520757678 key: test_precision value: [0.82191781 0.81944444 0.74712644 0.84415584 0.76190476 0.7875 0.8125 0.81578947 0.91304348 0.83333333] mean value: 0.8156715580784251 key: train_precision value: [0.86687307 0.86077844 0.78812416 0.84728341 0.81382228 
0.83764368 0.8125 0.82461103 0.85627837 0.97610922] mean value: 0.8484023648159316 key: test_recall value: [0.90909091 0.88059701 0.98484848 0.98484848 0.96969697 0.95454545 0.98484848 0.93939394 0.95454545 0.37878788] mean value: 0.8941203075531434 key: train_recall value: [0.94117647 0.96801347 0.98151261 0.9697479 0.9697479 0.97983193 0.98319328 0.97983193 0.9512605 0.48067227] mean value: 0.9204988257929434 key: test_roc_auc value: [0.85753053 0.84181366 0.82575758 0.90151515 0.83333333 0.84848485 0.87878788 0.86363636 0.93181818 0.65151515] mean value: 0.8434192672998643 key: train_roc_auc value: [0.89819766 0.90585547 0.85882353 0.89747899 0.87394958 0.89495798 0.87815126 0.88571429 0.89579832 0.73445378] mean value: 0.8723380867498515 key: test_jcc value: [0.75949367 0.7375 0.73863636 0.83333333 0.74418605 0.75903614 0.80246914 0.775 0.875 0.35211268] mean value: 0.7376767370804521 key: train_jcc value: [0.82232012 0.83697234 0.77659574 0.82546495 0.79367263 0.82344633 0.80136986 0.8108484 0.82028986 0.47508306] mean value: 0.778606328564591 MCC on Blind test: 0.48 Accuracy on Blind test: 0.81 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.03250432 0.04610515 0.04896641 0.05026364 0.07544041 0.0446434 0.05205488 0.07194495 0.04098344 0.04652524] mean value: 0.050943183898925784 key: score_time value: [0.01292706 0.01292372 0.01289201 0.01292682 0.01127601 0.01288795 0.01278996 0.01134944 0.01288676 0.0128274 ] mean value: 0.01256871223449707 key: test_mcc value: [0.42640275 0.76265475 0.70878358 0.83573501 0.62994079 0.67426617 0.70511024 0.76072577 0.78816781 0.4747723 ] mean value: 0.6766559176006244 key: train_mcc value: [0.46894069 0.82260467 0.80301916 0.84706002 0.68505868 0.76682222 0.77055923 0.82251813 0.73391183 0.68531986] mean value: 0.7405814492275309 key: test_accuracy value: [0.68421053 0.87969925 0.84848485 0.91666667 0.8030303 0.82575758 0.84848485 0.87878788 0.88636364 0.71969697] mean value: 0.8291182501708818 key: train_accuracy value: [0.68797309 0.91084945 0.89579832 0.92352941 0.82773109 0.87310924 0.88151261 0.91092437 0.85378151 0.82605042] mean value: 0.8591259514739453 key: test_fscore value: [0.57142857 0.875 0.86111111 0.91970803 0.77192982 0.84563758 0.83606557 0.87301587 0.89655172 0.65420561] mean value: 0.8104653898591715 key: train_fscore value: [0.55568862 0.90862069 0.90387597 0.9234651 0.79842675 0.8862095 0.87262873 0.90909091 0.87091988 0.79443893] mean value: 0.8423365062744236 key: test_precision value: [0.875 0.91803279 0.79487179 0.88732394 0.91666667 0.75903614 0.91071429 0.91666667 0.82278481 0.85365854] mean value: 0.8654755635756893 key: train_precision value: [0.96666667 0.93109541 0.83884892 0.92424242 0.96208531 0.80327869 
0.94335938 0.92819615 0.77954847 0.97087379] mean value: 0.9048195196007951 key: test_recall value: [0.42424242 0.8358209 0.93939394 0.95454545 0.66666667 0.95454545 0.77272727 0.83333333 0.98484848 0.53030303] mean value: 0.7896426956128448 key: train_recall value: [0.38991597 0.88720539 0.97983193 0.92268908 0.68235294 0.98823529 0.81176471 0.8907563 0.98655462 0.67226891] mean value: 0.8211575135104547 key: test_roc_auc value: [0.68227047 0.88003166 0.84848485 0.91666667 0.8030303 0.82575758 0.84848485 0.87878788 0.88636364 0.71969697] mean value: 0.8289574853007688 key: train_roc_auc value: [0.68822398 0.91082958 0.89579832 0.92352941 0.82773109 0.87310924 0.88151261 0.91092437 0.85378151 0.82605042] mean value: 0.8591490535608183 key: test_jcc value: [0.4 0.77777778 0.75609756 0.85135135 0.62857143 0.73255814 0.71830986 0.77464789 0.8125 0.48611111] mean value: 0.6937925115801036 key: train_jcc value: [0.38474295 0.83254344 0.82461103 0.8578125 0.66448445 0.79566982 0.77403846 0.83333333 0.77135348 0.65897858] mean value: 0.739756806448993 MCC on Blind test: 0.65 Accuracy on Blind test: 0.84 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.43816185 0.42773724 0.43033075 0.41837001 0.42521048 0.42237544 0.42985082 0.42985892 0.45174265 0.4252367 ] mean value: 0.4298874855041504 key: score_time value: [0.01665282 0.01664472 0.0165534 0.01650286 0.01656866 0.01707387 0.01700234 0.01646709 0.0166471 0.01660824] mean value: 0.01667211055755615 key: test_mcc value: [0.86558065 0.78977162 0.86452993 0.8824419 0.86452993 0.78787879 0.92434853 0.833429 0.87919164 0.86452993] mean value: 0.8556231936727914 key: train_mcc value: [0.91951324 0.92281332 0.92138456 0.91780392 0.92795261 0.93618205 0.92974798 0.9363249 0.92130122 0.93156752] mean value: 0.9264591303705949 key: test_accuracy value: [0.93233083 0.89473684 0.93181818 0.93939394 0.93181818 0.89393939 0.96212121 0.91666667 0.93939394 0.93181818] mean value: 0.9274037366142629 key: train_accuracy value: [0.95962994 0.96131203 0.9605042 0.95882353 0.96386555 0.96806723 0.96470588 0.96806723 0.9605042 0.96554622] mean value: 0.9631026001653815 key: test_fscore value: [0.93333333 0.89705882 0.93023256 0.94202899 0.93333333 0.89393939 0.96183206 0.91729323 0.94029851 0.93333333] mean value: 0.9282683562729682 key: train_fscore value: [0.96013289 0.96166667 0.96106048 0.95920067 0.96425603 0.96822742 0.96517413 0.96838602 0.96099585 0.96608768] mean value: 0.9635187834058504 key: test_precision value: [0.91304348 0.88405797 0.95238095 0.90277778 0.91304348 0.89393939 0.96923077 0.91044776 0.92647059 0.91304348] mean value: 0.9178435648555319 key: train_precision value: [0.94909688 0.95214521 0.94771242 0.95049505 0.95394737 0.96339434 
0.95253682 0.95881384 0.94918033 0.95114007] mean value: 0.9528462330084465 key: test_recall value: [0.95454545 0.91044776 0.90909091 0.98484848 0.95454545 0.89393939 0.95454545 0.92424242 0.95454545 0.95454545] mean value: 0.9395296246042515 key: train_recall value: [0.97142857 0.97138047 0.97478992 0.96806723 0.97478992 0.97310924 0.97815126 0.97815126 0.97310924 0.98151261] mean value: 0.974448971507795 key: test_roc_auc value: [0.93249661 0.89461782 0.93181818 0.93939394 0.93181818 0.89393939 0.96212121 0.91666667 0.93939394 0.93181818] mean value: 0.9274084124830394 key: train_roc_auc value: [0.95962001 0.96132049 0.9605042 0.95882353 0.96386555 0.96806723 0.96470588 0.96806723 0.9605042 0.96554622] mean value: 0.9631024531024531 key: test_jcc value: [0.875 0.81333333 0.86956522 0.89041096 0.875 0.80821918 0.92647059 0.84722222 0.88732394 0.875 ] mean value: 0.8667545441830428 key: train_jcc value: [0.92332268 0.92616372 0.92503987 0.9216 0.93097913 0.93841167 0.93269231 0.93870968 0.92492013 0.9344 ] mean value: 0.929623919553232 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. 
warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, 
verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.24893618 0.26400042 0.26318049 0.26846218 0.28313684 0.26294708 0.2789216 0.26224089 0.26190519 0.27044439] mean value: 0.2664175271987915 key: score_time value: [0.01890635 0.03807998 0.04070425 0.04163957 0.03528833 0.02002311 0.04702449 0.03652 0.03795028 0.02937794] mean value: 0.03455142974853516 key: test_mcc value: [0.89484396 0.73050958 0.78932976 0.85004744 0.94112395 0.80312249 0.92434853 0.84848485 0.89404202 0.88040627] mean value: 0.8556258862023336 key: train_mcc value: [0.98822677 0.99327725 0.99832074 0.99160924 0.99159804 0.99328292 1. 
0.99495939 0.98992156 0.98823669] mean value: 0.9929432605655744 key: test_accuracy value: [0.94736842 0.86466165 0.89393939 0.92424242 0.96969697 0.90151515 0.96212121 0.92424242 0.9469697 0.93939394] mean value: 0.9274151287309182 key: train_accuracy value: [0.9941127 0.99663583 0.99915966 0.99579832 0.99579832 0.99663866 1. 0.99747899 0.99495798 0.99411765] mean value: 0.996469810800687 key: test_fscore value: [0.94736842 0.86956522 0.890625 0.92647059 0.97058824 0.90076336 0.96240602 0.92424242 0.94656489 0.94117647] mean value: 0.9279770616116411 key: train_fscore value: [0.99412259 0.99662732 0.99916037 0.99580889 0.99580185 0.9966443 1. 0.99748111 0.99496644 0.99412259] mean value: 0.9964735439198161 key: test_precision value: [0.94029851 0.84507042 0.91935484 0.9 0.94285714 0.90769231 0.95522388 0.92424242 0.95384615 0.91428571] mean value: 0.9202871392228333 key: train_precision value: [0.99328859 0.99831081 0.99832215 0.99331104 0.99496644 0.99497487 1. 0.9966443 0.99329983 0.99328859] mean value: 0.9956406621581875 key: test_recall value: [0.95454545 0.89552239 0.86363636 0.95454545 1. 0.89393939 0.96969697 0.92424242 0.93939394 0.96969697] mean value: 0.9365219357756671 key: train_recall value: [0.99495798 0.99494949 1. 0.99831933 0.99663866 0.99831933 1. 0.99831933 0.99663866 0.99495798] mean value: 0.9973100755453697 key: test_roc_auc value: [0.94742198 0.86442786 0.89393939 0.92424242 0.96969697 0.90151515 0.96212121 0.92424242 0.9469697 0.93939394] mean value: 0.92739710538218 key: train_roc_auc value: [0.99411199 0.99663441 0.99915966 0.99579832 0.99579832 0.99663866 1. 0.99747899 0.99495798 0.99411765] mean value: 0.9964695979401862 key: test_jcc value: [0.9 0.76923077 0.8028169 0.8630137 0.94285714 0.81944444 0.92753623 0.85915493 0.89855072 0.88888889] mean value: 0.8671493731559037 key: train_jcc value: [0.98831386 0.99327731 0.99832215 0.99165275 0.9916388 0.99331104 1. 
0.99497487 0.98998331 0.98831386] mean value: 0.9929787938678081 MCC on Blind test: 0.69 Accuracy on Blind test: 0.88 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.91420841 0.84525871 0.94702196 0.84990621 1.60249829 0.84771895 1.92725825 0.90147996 1.74981856 0.9151895 ] mean value: 1.1500358819961547 key: score_time value: [0.05293345 0.03867912 0.0567255 0.05308461 0.05238461 0.05297732 0.05538249 0.05317044 0.05231237 0.06014085] mean value: 0.05277907848358154 key: test_mcc value: [0.72969918 0.74830832 0.63753558 0.77711043 0.71883597 0.72861209 0.73960026 0.82158384 0.75792383 0.72861209] mean value: 0.7387821583904562 key: train_mcc value: [0.93672036 0.94183394 0.95164901 0.94492358 0.9316729 0.94476334 0.93475087 0.94156086 0.94133735 0.93829364] mean value: 0.9407505841032879 key: test_accuracy value: [0.86466165 0.87218045 0.81818182 0.88636364 0.85606061 0.86363636 0.86363636 0.90909091 0.87878788 0.86363636] mean value: 0.8676236044657097 key: train_accuracy value: [0.96804037 0.9705635 0.97563025 0.97226891 0.96554622 0.97226891 0.96722689 0.97058824 0.97058824 0.96890756] mean value: 0.9701629078881342 key: test_fscore value: [0.86567164 0.87943262 0.82352941 0.89208633 0.86524823 
0.86764706 0.875 0.91304348 0.88059701 0.86764706] mean value: 0.8729902846388133 key: train_fscore value: [0.96864686 0.97109827 0.97597349 0.97265949 0.96614368 0.97256858 0.96763485 0.97100249 0.97085762 0.9693962 ] mean value: 0.9705981520486012 key: test_precision value: [0.85294118 0.83783784 0.8 0.84931507 0.81333333 0.84285714 0.80769231 0.875 0.86764706 0.84285714] mean value: 0.8389481068365033 key: train_precision value: [0.95137763 0.95299838 0.9624183 0.95915033 0.94967532 0.96217105 0.9557377 0.95751634 0.9620462 0.95439739] mean value: 0.9567488661268432 key: test_recall value: [0.87878788 0.92537313 0.84848485 0.93939394 0.92424242 0.89393939 0.95454545 0.95454545 0.89393939 0.89393939] mean value: 0.910719131614654 key: train_recall value: [0.98655462 0.98989899 0.98991597 0.98655462 0.98319328 0.98319328 0.97983193 0.98487395 0.97983193 0.98487395] mean value: 0.9848722519310755 key: test_roc_auc value: [0.86476707 0.87177748 0.81818182 0.88636364 0.85606061 0.86363636 0.86363636 0.90909091 0.87878788 0.86363636] mean value: 0.8675938489371325 key: train_roc_auc value: [0.96802479 0.97057975 0.97563025 0.97226891 0.96554622 0.97226891 0.96722689 0.97058824 0.97058824 0.96890756] mean value: 0.9701629742806214 key: test_jcc value: [0.76315789 0.78481013 0.7 0.80519481 0.7625 0.76623377 0.77777778 0.84 0.78666667 0.76623377] mean value: 0.7752574803425902 key: train_jcc value: [0.9392 0.94382022 0.95307443 0.94677419 0.93450479 0.94660194 0.93729904 0.94363929 0.9433657 0.94060995] mean value: 0.9428889560478227 MCC on Blind test: 0.55 Accuracy on Blind test: 0.82 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', 
MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear 
warnings.warn("Variables are collinear") Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [2.26034164 2.15656042 2.12556386 2.0825882 2.16070485 2.17109656 2.21937275 2.30224919 2.25836587 2.26037884] mean value: 2.1997222185134886 key: score_time value: [0.01029849 0.01075649 0.00986791 0.01148391 0.01252151 0.01258969 0.01346898 0.01144934 0.01176357 0.0115099 ] mean value: 0.011570978164672851 key: test_mcc value: [0.88134139 0.77857748 0.81855773 0.83573501 0.92690611 0.8196886 0.9701425 0.833429 0.90950859 0.85004744] mean value: 0.862393386471883 key: train_mcc value: [0.9849739 0.98160977 0.98328216 0.98158054 0.97997035 0.97660853 0.97655886 0.99329976 0.98664381 0.98498664] mean value: 0.9829514304345836 key: test_accuracy value: [0.93984962 0.88721805 0.90909091 0.91666667 0.96212121 0.90909091 0.98484848 0.91666667 0.95454545 0.92424242] mean value: 0.930434039644566 key: train_accuracy value: [0.99243061 0.99074853 0.99159664 0.9907563 0.98991597 0.98823529 0.98823529 0.99663866 0.99327731 0.99243697] mean value: 0.9914271579111039 key: test_fscore value: [0.94117647 0.89361702 0.90769231 0.91970803 0.96350365 0.91176471 0.98507463 0.91729323 0.95522388 0.92647059] mean value: 0.9321524513052296 key: train_fscore value: [0.99249374 0.99081036 0.99165275 0.99081036 0.99 0.98833333 0.98831386 0.99664992 0.9933222 0.99249374] mean value: 0.9914880272309861 key: test_precision value: [0.91428571 0.85135135 0.921875 0.88732394 0.92957746 0.88571429 0.97058824 0.91044776 0.94117647 0.9 ] mean value: 0.9112340226878438 key: train_precision value: [0.98509934 0.98341625 0.98507463 0.98504983 0.98181818 0.98016529 0.98175788 
0.9933222 0.986733 0.98509934] mean value: 0.984753594200818 key: test_recall value: [0.96969697 0.94029851 0.89393939 0.95454545 1. 0.93939394 1. 0.92424242 0.96969697 0.95454545] mean value: 0.9546359113523293 key: train_recall value: [1. 0.9983165 0.99831933 0.99663866 0.99831933 0.99663866 0.99495798 1. 1. 1. ] mean value: 0.9983190447896331 key: test_roc_auc value: [0.94007237 0.88681592 0.90909091 0.91666667 0.96212121 0.90909091 0.98484848 0.91666667 0.95454545 0.92424242] mean value: 0.9304161013116237 key: train_roc_auc value: [0.99242424 0.99075489 0.99159664 0.9907563 0.98991597 0.98823529 0.98823529 0.99663866 0.99327731 0.99243697] mean value: 0.9914271567212743 key: test_jcc value: [0.88888889 0.80769231 0.83098592 0.85135135 0.92957746 0.83783784 0.97058824 0.84722222 0.91428571 0.8630137 ] mean value: 0.8741443636484267 key: train_jcc value: [0.98509934 0.98178808 0.98344371 0.98178808 0.98019802 0.97693575 0.97689769 0.9933222 0.986733 0.98509934] mean value: 0.9831305207536616 MCC on Blind test: 0.78 Accuracy on Blind test: 0.91 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.16132116 0.24572635 0.1239059 0.07294703 0.23971868 0.23406434 0.09096909 0.06589794 0.17431712 0.08550954] mean value: 0.14943771362304686 key: score_time value: [0.0302527 0.04182434 0.0523839 0.02149272 0.0493772 0.02764463 0.01641536 0.01400828 0.01939535 0.01256824] mean value: 0.028536272048950196 key: test_mcc value: [0.34042787 0.28728495 0.14547859 0.26352314 0.23664319 0.2466911 0.30151134 0.27050089 0.25400025 0.22882178] mean value: 0.25748831072069617 key: train_mcc value: [0.28121216 0.27899486 0.28775432 0.26665917 0.27204268 0.35622176 0.27382004 0.30618622 0.27910312 0.27558913] mean value: 0.287758344848468 key: test_accuracy value: [0.60150376 0.57894737 0.53030303 0.57575758 0.5530303 0.56818182 0.58333333 0.56818182 0.56060606 0.56818182] mean value: 0.5688026885395306 key: train_accuracy value: [0.57359125 0.57190917 0.57647059 0.56638655 0.56890756 0.61260504 0.5697479 0.58571429 0.57226891 0.57058824] mean value: 0.5768189496151699 key: test_fscore value: [0.71351351 0.70526316 0.67708333 0.69892473 0.69109948 0.69518717 0.70588235 0.6984127 0.69473684 0.69189189] mean value: 0.6971995163490601 /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") key: train_fscore value: [0.70123748 0.70005893 0.70247934 0.6975381 0.69876688 0.72077529 0.69917744 0.70707071 0.70041201 0.69958848] mean value: 0.7027104644570163 key: test_precision value: [0.55462185 0.54471545 0.51587302 0.54166667 0.528 0.53719008 0.54545455
0.53658537 0.53225806 0.53781513] mean value: 0.5374180162953031 key: train_precision value: [0.5399274 0.53853128 0.54140127 0.53555356 0.53700361 0.56344697 0.53748871 0.546875 0.53894928 0.53797468] mean value: 0.5417151759223713 key: test_recall value: [1. 1. 0.98484848 0.98484848 1. 0.98484848 1. 1. 1. 0.96969697] mean value: 0.9924242424242424 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.60447761 0.57575758 0.53030303 0.57575758 0.5530303 0.56818182 0.58333333 0.56818182 0.56060606 0.56818182] mean value: 0.5687810945273631 key: train_roc_auc value: [0.57323232 0.57226891 0.57647059 0.56638655 0.56890756 0.61260504 0.5697479 0.58571429 0.57226891 0.57058824] mean value: 0.57681903064256 key: test_jcc value: [0.55462185 0.54471545 0.51181102 0.53719008 0.528 0.53278689 0.54545455 0.53658537 0.53225806 0.52892562] mean value: 0.5352348883065589 key: train_jcc value: [0.5399274 0.53853128 0.54140127 0.53555356 0.53700361 0.56344697 0.53748871 0.546875 0.53894928 0.53797468] mean value: 0.5417151759223713 MCC on Blind test: 0.11 Accuracy on Blind test: 0.38 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02625012 0.03711963 0.0367558 0.04246736 0.06679559 0.04406571 0.03983307 0.07703924 0.02355957 0.05142474] mean value: 0.04453108310699463 key: score_time value: [0.07219577 0.03721213 0.03111935 0.02411985 0.03120399 0.02946353 0.03227925 0.03698516 0.02328372 0.03299832] mean value: 0.03508610725402832 key: test_mcc value: [0.748702 0.70036445 0.78368849 0.7800135 0.74420841 0.79373126 0.84119102 0.76072577 0.87919164 0.73029674] mean value: 0.7762113288716164 key: train_mcc value: [0.81479237 0.81790967 0.81158 0.81564917 0.81538413 0.81558668 0.80973991 0.82018755 0.82058077 0.81282825] mean value: 0.8154238490138148 key: test_accuracy value: [0.87218045 0.84962406 0.88636364 0.88636364 0.86363636 0.89393939 0.91666667 0.87878788 0.93939394 0.86363636] mean value: 0.8850592390066074 key: train_accuracy value: [0.90496215 0.90664424 0.90336134 0.90588235 0.90588235 0.90504202 0.90252101 0.90756303 0.90840336 0.90420168] mean value: 0.9054463534783131 key: test_fscore value: [0.87769784 0.85507246 0.8951049 0.89361702 0.87671233 0.9 0.92198582 0.88405797 0.94029851 0.86956522] mean value: 0.891411206211467 key: train_fscore value: [0.90996016 0.91127098 0.90836653 0.91025641 0.91011236 0.91024623 0.90749601 0.91242038 0.91259022 0.90894569] mean value: 0.9101664971757291 key: test_precision value: [0.83561644 0.83098592 0.83116883 0.84 0.8 0.85135135 0.86666667 0.84722222 0.92647059 0.83333333] mean value: 0.8462815346826821 key: train_precision value: [0.86515152 0.86757991 0.86363636 0.86983155 0.87096774 0.86295181 0.86342944 0.86686838 
0.87269939 0.86605784] mean value: 0.8669173928283019 key: test_recall value: [0.92424242 0.88059701 0.96969697 0.95454545 0.96969697 0.95454545 0.98484848 0.92424242 0.95454545 0.90909091] mean value: 0.9426051560379919 key: train_recall value: [0.95966387 0.95959596 0.95798319 0.95462185 0.95294118 0.96302521 0.95630252 0.96302521 0.95630252 0.95630252] mean value: 0.957976402682285 key: test_roc_auc value: [0.87256897 0.84938942 0.88636364 0.88636364 0.86363636 0.89393939 0.91666667 0.87878788 0.93939394 0.86363636] mean value: 0.8850746268656716 key: train_roc_auc value: [0.90491611 0.90668874 0.90336134 0.90588235 0.90588235 0.90504202 0.90252101 0.90756303 0.90840336 0.90420168] mean value: 0.9054461986814928 key: test_jcc value: [0.78205128 0.74683544 0.81012658 0.80769231 0.7804878 0.81818182 0.85526316 0.79220779 0.88732394 0.76923077] mean value: 0.8049400901115182 key: train_jcc value: [0.83479532 0.83700441 0.83211679 0.83529412 0.83505155 0.83527697 0.83065693 0.83894583 0.83923304 0.83308931] mean value: 0.8351464258960671 MCC on Blind test: 0.68 Accuracy on Blind test: 0.87 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:136: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:139: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic
RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', 
transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.98454618 0.67307448 0.86623049 0.59880996 0.63273072 0.63469291 0.69719315 0.63847065 0.83392954 0.6913631 ] mean value: 0.7251041173934937 key: score_time value: [0.06120563 0.03548265 0.03454709 0.03712463 0.02196646 0.04404831 0.02621794 0.02708268 0.03657389 0.02833247] mean value: 0.03525817394256592 key: test_mcc value: [0.748702 0.71431805 0.76072577 0.7800135 0.78816781 0.79373126 0.80758535 0.78932976 0.84887469 0.74456392] mean value: 0.777601212041837 key: train_mcc value: [0.81479237 0.84539007 0.8465731 0.81564917 0.83512819 0.81558668 0.82970909 0.83719788 0.83285454 0.83465317] mean value: 0.8307534248261067 key: test_accuracy value: [0.87218045 0.85714286 0.87878788 0.88636364 0.88636364 0.89393939 0.90151515 0.89393939 0.92424242 0.87121212] mean value: 0.8865686944634312 key: train_accuracy value: [0.90496215 0.92094197 0.92184874 0.90588235 0.91596639 0.90504202 0.91344538 0.91680672 0.91512605 0.91596639] mean value: 0.9135988154723622 key: test_fscore value: [0.87769784 0.85925926 0.88405797 0.89361702 0.89655172 0.9 0.90647482 0.89705882 0.92537313 0.87591241] mean value: 0.8916003004175677 key: train_fscore value: [0.90996016 0.92431562 0.92493947 0.91025641 0.9194847 0.91024623 0.91686844 0.92048193 0.91835085 0.91922456] mean value: 0.9174128360722797 key: test_precision value: [0.83561644 0.85294118 0.84722222 
0.84 0.82278481 0.85135135 0.8630137 0.87142857 0.91176471 0.84507042] mean value: 0.8541193397003181 key: train_precision value: [0.86515152 0.88580247 0.88975155 0.86983155 0.88253478 0.86295181 0.88198758 0.88153846 0.8847352 0.88491446] mean value: 0.8789199372030476 key: test_recall value: [0.92424242 0.86567164 0.92424242 0.95454545 0.98484848 0.95454545 0.95454545 0.92424242 0.93939394 0.90909091] mean value: 0.9335368611488014 key: train_recall value: [0.95966387 0.96632997 0.96302521 0.95462185 0.95966387 0.96302521 0.95462185 0.96302521 0.95462185 0.95630252] mean value: 0.9594901394901395 key: test_roc_auc value: [0.87256897 0.85707825 0.87878788 0.88636364 0.88636364 0.89393939 0.90151515 0.89393939 0.92424242 0.87121212] mean value: 0.8866010854816825 key: train_roc_auc value: [0.90491611 0.92098011 0.92184874 0.90588235 0.91596639 0.90504202 0.91344538 0.91680672 0.91512605 0.91596639] mean value: 0.9135980250686133 key: test_jcc value: [0.78205128 0.75324675 0.79220779 0.80769231 0.8125 0.81818182 0.82894737 0.81333333 0.86111111 0.77922078] mean value: 0.804849254546623 key: train_jcc value: [0.83479532 0.85928144 0.86036036 0.83529412 0.8509687 0.83527697 0.84649776 0.85267857 0.8490284 0.85052317] mean value: 0.8474704813594193 MCC on Blind test: 0.66 Accuracy on Blind test: 0.86 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random 
Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.12481308 0.16899014 0.2003262 0.20602155 0.16480041 0.11521244 0.11547208 0.13901997 0.20162272 0.07334018] mean value: 0.15096187591552734 key: score_time value: [0.02007031 0.01953554 0.03566623 0.05057263 0.02884102 0.03782344 0.01230049 0.01505136 0.0135622 0.01324034] mean value: 0.024666357040405273 key: test_mcc value: [0.69961549 0.60900045 0.68568568 0.79373126 0.74420841 0.69986771 0.75897093 0.75897093 0.833429 0.74456392] mean value: 0.7328043779458789 key: train_mcc value: [0.76766604 0.78845283 0.76980875 0.76610944 0.76595745 0.77672743 0.75440121 0.77176593 0.75793243 0.77079477] mean value: 0.7689616274532732 key: test_accuracy value: [0.84962406 0.80451128 0.84090909 0.89393939 0.86363636 0.84848485 0.87878788 0.87878788 0.91666667 0.87121212] mean value: 0.8646559580770107 key: train_accuracy value: [0.88309504 0.89318755 0.88403361 0.88235294 0.88235294 0.88739496 0.87647059 0.88487395 0.87815126 0.88487395] mean value: 0.8836786792092783 key: test_fscore value: [0.85074627 0.80597015 0.84892086 0.9 0.87671233 0.85507246 0.88235294 0.88235294 0.91729323 0.87591241] mean value: 0.8695333597949811 key: train_fscore value: [0.88671557 0.89683184 0.88780488 0.8858075 0.88562092 0.89123377 0.8801956 0.88888889 0.8820179 0.88779689] mean value: 0.8872913750285026 key: test_precision value: [0.83823529 0.80597015 0.80821918 0.85135135 0.8 0.81944444 0.85714286 0.85714286 0.91044776 0.84507042] mean value: 0.8393024315264321 key: train_precision value: [0.86075949 0.86656201 0.85984252 0.86053883 0.86168521 0.86185243 0.85443038 
0.85893417 0.85488959 0.8658147 ] mean value: 0.8605309333357611 key: test_recall value: [0.86363636 0.80597015 0.89393939 0.95454545 0.96969697 0.89393939 0.90909091 0.90909091 0.92424242 0.90909091] mean value: 0.9033242876526458 key: train_recall value: [0.91428571 0.92929293 0.91764706 0.91260504 0.91092437 0.92268908 0.90756303 0.9210084 0.91092437 0.91092437] mean value: 0.9157864357864358 key: test_roc_auc value: [0.84972863 0.80450023 0.84090909 0.89393939 0.86363636 0.84848485 0.87878788 0.87878788 0.91666667 0.87121212] mean value: 0.8646653098145636 key: train_roc_auc value: [0.88306878 0.89321789 0.88403361 0.88235294 0.88235294 0.88739496 0.87647059 0.88487395 0.87815126 0.88487395] mean value: 0.8836790877967348 key: test_jcc value: [0.74025974 0.675 0.7375 0.81818182 0.7804878 0.74683544 0.78947368 0.78947368 0.84722222 0.77922078] mean value: 0.7703655176221637 key: train_jcc value: [0.79648609 0.81296024 0.79824561 0.79502196 0.79472141 0.80380673 0.7860262 0.8 0.78893741 0.7982327 ] mean value: 0.7974438350039706 MCC on Blind test: 0.69 Accuracy on Blind test: 0.88 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, 
booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [3.47399139 3.63057613 3.74403596 3.35848165 3.5461297 2.97885227 3.33522224 3.00847769 3.41326523 3.35739779] mean value: 3.384643006324768 key: score_time value: [0.01660132 0.02039504 0.02370691 0.02178669 0.0171504 0.03919339 0.01123929 0.02096009 0.02390099 0.02181339] mean value: 0.021674752235412598 key: test_mcc value: [0.74440174 0.67051692 0.66943868 0.76320314 0.74420841 0.72760688 0.74250948 0.80312249 0.80534465 0.71285802] mean value: 0.7383210399826687 key: train_mcc value: [0.83036998 0.84573107 0.82558796 0.82425731 0.77542098 0.81049737 0.80032673 0.78134411 0.80732609 0.82067938] mean value: 0.8121540966146651 key: test_accuracy value: [0.87218045 0.83458647 0.83333333 0.87878788 0.86363636 0.86363636 0.87121212 0.90151515 0.90151515 0.85606061] mean value: 0.8676463886990202 key: train_accuracy value: [0.91505467 0.92262405 0.91260504 0.91176471 0.88739496 0.90504202 0.9 0.88991597 0.90336134 0.91008403] mean value: 0.9057846788841692 key: test_fscore value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [0.87022901 0.83076923 0.84057971 0.88571429 0.87671233 0.86567164 0.87218045 0.90225564 0.90510949 0.85925926] mean value: 0.8708481043356118 key: train_fscore value: [0.91618257 0.92384106 0.91390728 0.91358025 0.88962109 0.90653433 0.90140845 0.89323553 0.90519373 0.91164327] mean value: 0.9075147566196161 key: test_precision value: [0.87692308 0.85714286 0.80555556 0.83783784 0.8 0.85294118 0.86567164 0.89552239 0.87323944 0.84057971] mean value: 0.8505413680545307 key: train_precision value: [0.90491803 0.90879479 0.9004894 0.89516129 0.8723748 0.89250814 0.88888889 0.86708861 0.88834951 0.8961039 ] mean value: 0.8914677356328867 key: test_recall value: [0.86363636 0.80597015 0.87878788 0.93939394 0.96969697 0.87878788 0.87878788 0.90909091 0.93939394 0.87878788] mean value: 0.8942333785617368 key: train_recall value: [0.92773109 0.93939394 0.92773109 0.93277311 0.90756303 0.9210084 0.91428571 0.9210084 0.92268908 0.92773109] mean value: 0.9241914947797301 key: test_roc_auc value: [0.87211669 0.83480326 0.83333333 0.87878788 0.86363636 0.86363636 0.87121212 0.90151515 0.90151515 0.85606061] mean value: 0.8676616915422886 key: train_roc_auc value: [0.915044 0.92263815 0.91260504 0.91176471 0.88739496 0.90504202 0.9 0.88991597 0.90336134 0.91008403] mean value: 0.9057850210791387 key: test_jcc value: [0.77027027 0.71052632 0.725 0.79487179 0.7804878 0.76315789 0.77333333 0.82191781 0.82666667 0.75324675] mean value: 0.7719478642012361 key: train_jcc value: [0.84532925 0.85846154 0.84146341 0.84090909 0.80118694 0.8290469 0.82051282 0.80706922 0.82680723 0.83763278] mean value: 0.8308419181684118 MCC on Blind test: 0.66 Accuracy on Blind test: 0.86 
Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', 
RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.0162425 0.0160017 0.01585889 0.01603508 0.01597643 0.01602697 0.01607943 0.01623201 0.01599336 0.01613903] mean value: 0.01605854034423828 key: score_time value: [0.01126313 0.01120281 0.01129889 0.01132226 0.01130223 0.01134753 0.01124859 0.01127553 0.01130271 0.01125908] mean value: 0.01128227710723877 key: test_mcc value: [0.42990719 0.37538744 0.42919754 0.44066028 0.53183137 0.49306684 0.43944438 0.42600643 0.5768179 0.50471461] mean value: 0.4647033999224429 key: train_mcc value: [0.49601743 0.51214528 0.4853481 0.4943284 0.49286649 0.48000922 0.48289321 0.50592018 0.48083138 0.48305298] mean value: 0.49134126650583104 key: test_accuracy value: [0.71428571 0.68421053 0.71212121 0.71969697 0.76515152 0.74242424 0.71969697 0.71212121 0.78787879 0.75 ] mean value: 0.7307587149692413 key: train_accuracy value: [0.74684609 0.75441548 0.74117647 0.74621849 0.74537815 0.73865546 0.74033613 0.75210084 0.7394958 0.74033613] mean value: 0.7444959043331378 key: test_fscore value: [0.6984127 0.6557377 0.68852459 0.70866142 0.75590551 0.71666667 0.71755725 0.6984127 0.78125 0.73170732] mean value: 0.7152835856689457 key: train_fscore value: [0.73433363 0.73928571 0.72597865 
0.73462214 0.73303965 0.72404614 0.72727273 0.74145486 0.72759227 0.72679045] mean value: 0.7314416230885522 key: test_precision value: [0.73333333 0.72727273 0.75 0.73770492 0.78688525 0.7962963 0.72307692 0.73333333 0.80645161 0.78947368] mean value: 0.7583828074360791 key: train_precision value: [0.7732342 0.78707224 0.77126654 0.76979742 0.77037037 0.76691729 0.76579926 0.77472527 0.76243094 0.76679104] mean value: 0.770840458530029 key: test_recall value: [0.66666667 0.59701493 0.63636364 0.68181818 0.72727273 0.65151515 0.71212121 0.66666667 0.75757576 0.68181818] mean value: 0.6778833107191315 key: train_recall value: [0.69915966 0.6969697 0.68571429 0.70252101 0.69915966 0.68571429 0.69243697 0.71092437 0.69579832 0.6907563 ] mean value: 0.6959154570919277 key: test_roc_auc value: [0.71393035 0.6848711 0.71212121 0.71969697 0.76515152 0.74242424 0.71969697 0.71212121 0.78787879 0.75 ] mean value: 0.7307892356399819 key: train_roc_auc value: [0.74688623 0.7543672 0.74117647 0.74621849 0.74537815 0.73865546 0.74033613 0.75210084 0.7394958 0.74033613] mean value: 0.7444950909656792 key: test_jcc value: [0.53658537 0.48780488 0.525 0.54878049 0.60759494 0.55844156 0.55952381 0.53658537 0.64102564 0.57692308] mean value: 0.5578265120183923 key: train_jcc value: [0.58019526 0.58640227 0.5698324 0.58055556 0.57858136 0.5674548 0.57142857 0.58913649 0.5718232 0.57083333] mean value: 0.5766243242866349 MCC on Blind test: 0.41 Accuracy on Blind test: 0.76 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra 
Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01833224 0.02844119 0.03023386 0.03021145 0.04428649 0.03315878 0.05487323 0.02842569 0.03093243 0.02116919] mean value: 0.032006454467773435 key: score_time value: [0.01621819 0.02138519 0.02097058 0.02414799 0.02312398 0.03966141 0.02990365 0.03438926 0.02296233 0.01327443] mean value: 0.02460370063781738 key: test_mcc value: [0.5339213 0.53558614 0.44562679 0.57575758 0.67161876 0.60717674 0.57602211 0.60633906 0.63753558 0.59097693] mean value: 0.5780560993827096 key: train_mcc value: [0.62994548 0.60817175 0.60363763 0.60684612 0.62694743 0.62016807 0.6103634 0.60178352 0.6067364 0.58220454] mean value: 0.6096804333287658 key: test_accuracy value: [0.76691729 0.76691729 0.71969697 0.78787879 0.83333333 0.8030303 0.78787879 0.8030303 0.81818182 0.79545455] mean value: 0.7882319434951014 key: train_accuracy value: [0.81497056 0.80403701 0.80168067 0.80336134 0.81344538 0.81008403 0.80504202 0.80084034 0.80336134 0.7907563 ] mean value: 0.8047578997957467 key: test_fscore value: [0.76691729 0.75968992 0.69421488 0.78787879 0.84285714 0.796875 0.78461538 0.8 0.82352941 0.79389313] mean value: 0.7850470948633774 key: train_fscore value: [0.81481481 0.80203908 0.79863481 0.80135823 0.81469115 0.81008403 0.80204778 0.79898219 0.80269815 0.78552972] mean value: 0.8030879959995846 key: test_precision value: [0.76119403 0.79032258 0.76363636 0.78787879 0.7972973 0.82258065 0.796875 0.8125 0.8 0.8 ] mean value: 0.7932284704469647 key: train_precision value: [0.81618887 0.80960549 0.81109185 0.80960549 0.8092869 0.81008403 0.81455806 0.80650685 0.80541455 0.80565371] mean 
value: 0.8097995804820648 key: test_recall value: [0.77272727 0.73134328 0.63636364 0.78787879 0.89393939 0.77272727 0.77272727 0.78787879 0.84848485 0.78787879] mean value: 0.779194934418815 key: train_recall value: [0.81344538 0.79461279 0.78655462 0.79327731 0.82016807 0.81008403 0.78991597 0.79159664 0.8 0.76638655] mean value: 0.7966041366041366 key: test_roc_auc value: [0.76696065 0.76718679 0.71969697 0.78787879 0.83333333 0.8030303 0.78787879 0.8030303 0.81818182 0.79545455] mean value: 0.7882632293080054 key: train_roc_auc value: [0.81497185 0.80402909 0.80168067 0.80336134 0.81344538 0.81008403 0.80504202 0.80084034 0.80336134 0.7907563 ] mean value: 0.8047572362278245 key: test_jcc value: [0.62195122 0.6125 0.53164557 0.65 0.72839506 0.66233766 0.64556962 0.66666667 0.7 0.65822785] mean value: 0.6477293648219603 key: train_jcc value: [0.6875 0.66950355 0.66477273 0.66855524 0.68732394 0.68079096 0.66951567 0.66525424 0.67042254 0.64680851] mean value: 0.671044737093254 MCC on Blind test: 0.52 Accuracy on Blind test: 0.81 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.02491474 0.01506805 0.01512551 0.01483035 0.01469612 0.01485538 0.01491857 0.0151217 0.01491117 0.0147326 ] mean value: 0.015917420387268066 key: score_time value: [0.03499436 0.02949691 0.02866745 0.02961278 0.03052759 0.02934623 0.02845168 0.03219891 0.03506231 0.02963805] mean value: 0.03079962730407715 key: test_mcc value: [0.53140339 0.6557787 0.42443734 0.67445327 0.56855832 0.5992912 0.66943868 0.67445327 0.64109064 0.50709255] mean value: 0.5945997369934125 key: train_mcc value: [0.75352205 0.7313818 0.7451251 0.72677403 0.73661389 0.74492321 0.74481758 0.72071332 0.72352842 0.7393187 ] mean value: 0.7366718091955822 key: test_accuracy value: [0.7593985 0.82706767 0.71212121 0.83333333 0.78030303 0.79545455 0.83333333 0.83333333 0.81818182 0.75 ] mean value: 0.794252677147414 key: train_accuracy value: [0.87384357 0.86291001 0.8697479 0.85882353 0.86470588 0.86890756 0.8697479 0.85630252 0.85630252 0.86638655] mean value: 0.8647677944180195 key: test_fscore value: [0.78082192 0.83453237 0.71641791 0.84507042 0.7972028 0.81118881 0.84057971 0.84507042 0.82857143 0.76923077] mean value: 0.8068686563765856 key: train_fscore value: [0.88132911 0.87073751 0.87727633 0.86915888 0.8735271 0.87735849 0.87708168 0.8663018 0.86774942 0.87470449] mean value: 0.8735224811615062 key: test_precision value: [0.7125 0.80555556 0.70588235 0.78947368 0.74025974 0.75324675 0.80555556 0.78947368 0.78378378 0.71428571] mean value: 0.7600016824049332 key: train_precision value: [0.83258595 0.82308846 0.82934132 0.80986938 0.820059 0.82422452 0.83033033 0.80994152 0.80372493 
0.82344214] mean value: 0.8206607530876882 key: test_recall value: [0.86363636 0.86567164 0.72727273 0.90909091 0.86363636 0.87878788 0.87878788 0.90909091 0.87878788 0.83333333] mean value: 0.8608095884215288 key: train_recall value: [0.93613445 0.92424242 0.93109244 0.93781513 0.93445378 0.93781513 0.92941176 0.93109244 0.94285714 0.93277311] mean value: 0.9337687802393685 key: test_roc_auc value: [0.76017639 0.82677521 0.71212121 0.83333333 0.78030303 0.79545455 0.83333333 0.83333333 0.81818182 0.75 ] mean value: 0.7943012211668928 key: train_roc_auc value: [0.87379113 0.86296155 0.8697479 0.85882353 0.86470588 0.86890756 0.8697479 0.85630252 0.85630252 0.86638655] mean value: 0.8647677050618228 key: test_jcc value: [0.64044944 0.71604938 0.55813953 0.73170732 0.6627907 0.68235294 0.725 0.73170732 0.70731707 0.625 ] mean value: 0.678051370196998 key: train_jcc value: [0.78783593 0.77106742 0.78138223 0.76859504 0.77545328 0.78151261 0.78107345 0.76413793 0.76639344 0.77731092] mean value: 0.7754762238935481 MCC on Blind test: 0.45 Accuracy on Blind test: 0.76 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.09510469 0.10395908 0.10734057 0.10678625 0.10629964 0.1047132 0.14021826 0.16393042 0.11608148 0.1044817 ] mean value: 0.11489152908325195 key: score_time value: [0.02618408 0.03990245 0.03856707 0.0683701 0.03701854 0.03856254 0.04088354 0.03897285 0.03920102 0.03909039] mean value: 0.04067525863647461 key: test_mcc value: [0.613804 0.6557787 0.5992912 0.72635073 0.73125738 0.66943868 0.73960026 0.71417356 0.77711043 0.73267501] mean value: 0.6959479952127847 key: train_mcc value: [0.72363501 0.74493969 0.75019202 0.73374129 0.73069778 0.73406597 0.72761525 0.73620188 0.7226944 0.75068681] mean value: 0.7354470114857837 key: test_accuracy value: [0.80451128 0.82706767 0.79545455 0.85606061 0.85606061 0.83333333 0.86363636 0.85606061 0.88636364 0.86363636] mean value: 0.8442185007974482 key: train_accuracy value: [0.85870479 0.86963835 0.87142857 0.86386555 0.86302521 0.86386555 0.8605042 0.86386555 0.85798319 0.87226891] mean value: 0.8645149868189497 key: test_fscore value: [0.81428571 0.83453237 0.81118881 0.86896552 0.8707483 0.84057971 0.875 0.86131387 0.89208633 0.87142857] mean value: 0.8540129197258242 /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn(
key: train_fscore value: [0.86750789 0.87708168 0.87981147 0.87203791 0.87032617 0.87223975 0.86929134 0.8734375 0.86703383 0.87993681] mean value: 0.8728704351424548 key: test_precision value: [0.77027027 0.80555556 0.75324675 0.79746835 0.79012346 0.80555556 0.80769231 0.83098592 0.84931507 0.82432432] mean value: 0.8034537561851378 key: train_precision value: [0.81723626 0.82908546 0.8259587 0.82265276 0.82628399 0.82169391 0.81777778 0.81605839 0.81508876 0.83010432] mean value: 0.8221940319020319 key: test_recall value: [0.86363636 0.86567164 0.87878788 0.95454545 0.96969697 0.87878788 0.95454545 0.89393939 0.93939394 0.92424242] mean value: 0.9123247399366803 key: train_recall value: [0.92436975 0.93097643 0.94117647 0.92773109 0.91932773 0.92941176 0.92773109 0.9394958 0.92605042 0.93613445] mean value: 0.9302405002405003 key: test_roc_auc value: [0.80495251 0.82677521 0.79545455 0.85606061 0.85606061 0.83333333 0.86363636 0.85606061 0.88636364 0.86363636] mean value: 0.8442333785617367 key:
train_roc_auc value: [0.85864952 0.8696899 0.87142857 0.86386555 0.86302521 0.86386555 0.8605042 0.86386555 0.85798319 0.87226891] mean value: 0.8645146139263786 key: test_jcc value: [0.68674699 0.71604938 0.68235294 0.76829268 0.77108434 0.725 0.77777778 0.75641026 0.80519481 0.7721519 ] mean value: 0.7461061070237571 key: train_jcc value: [0.76601671 0.78107345 0.78541374 0.77310924 0.77042254 0.77342657 0.76880223 0.77531207 0.76527778 0.78561354] mean value: 0.7744467869457157 MCC on Blind test: 0.66 Accuracy on Blind test: 0.86 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [ 5.61930633 10.23130774 10.04514098 6.87647343 3.24669886 5.83537579 5.23547888 5.2183423 6.56729603 11.27240038] mean value: 7.014782071113586 key: score_time value: [0.01726365 0.02066183 0.01314092 0.01353931 0.01514196 0.0191195 0.02220035 0.02619338 0.02870178 0.0352571 ] mean value: 0.021121978759765625 key: test_mcc value: [0.86558065 0.70694867 0.84119102 0.84515425 0.87177979 0.83806027 0.89901011 0.8824419 0.89486432 0.82773811] mean value: 0.84727690965409 key: train_mcc value: [0.97990346 0.96843712 0.95708143 0.93488809 0.92357126 0.98823669 0.93468125 0.97315872 0.9412097 0.98162491] 
mean value: 0.9582792637063065 key: test_accuracy value: [0.93233083 0.84962406 0.91666667 0.91666667 0.93181818 0.91666667 0.9469697 0.93939394 0.9469697 0.90909091] mean value: 0.9206197311460469 key: train_accuracy value: [0.98990749 0.98402019 0.97815126 0.96638655 0.96134454 0.99411765 0.96638655 0.98655462 0.97058824 0.9907563 ] mean value: 0.97882133845969 key: test_fscore value: [0.93333333 0.86111111 0.92198582 0.92307692 0.93617021 0.92086331 0.94964029 0.94202899 0.94573643 0.91549296] mean value: 0.9249439370374717 key: train_fscore value: [0.98998331 0.98423237 0.9785832 0.96747967 0.96217105 0.9941127 0.96742671 0.98662207 0.97046414 0.99082569] mean value: 0.9791900900647359 key: test_precision value: [0.91304348 0.80519481 0.86666667 0.85714286 0.88 0.87671233 0.90410959 0.90277778 0.96825397 0.85526316] mean value: 0.88291646289999 key: train_precision value: [0.98341625 0.9705401 0.95961228 0.93700787 0.94202899 0.99494949 0.93838863 0.98169717 0.97457627 0.98344371] mean value: 0.966566075938182 key: test_recall value: [0.95454545 0.92537313 0.98484848 1. 1. 0.96969697 1. 0.98484848 0.92424242 0.98484848] mean value: 0.9728403437358661 key: train_recall value: [0.99663866 0.9983165 0.99831933 1. 
0.98319328 0.99327731 0.99831933 0.99159664 0.96638655 0.99831933] mean value: 0.9924366918484565 key: test_roc_auc value: [0.93249661 0.8490502 0.91666667 0.91666667 0.93181818 0.91666667 0.9469697 0.93939394 0.9469697 0.90909091] mean value: 0.9205789235639983 key: train_roc_auc value: [0.98990182 0.9840322 0.97815126 0.96638655 0.96134454 0.99411765 0.96638655 0.98655462 0.97058824 0.9907563 ] mean value: 0.978821973233738 key: test_jcc value: [0.875 0.75609756 0.85526316 0.85714286 0.88 0.85333333 0.90410959 0.89041096 0.89705882 0.84415584] mean value: 0.8612572124976998 key: train_jcc value: [0.98016529 0.96895425 0.95806452 0.93700787 0.92709984 0.98829431 0.93690852 0.97359736 0.94262295 0.98181818] mean value: 0.9594533093393642 MCC on Blind test: 0.65 Accuracy on Blind test: 0.86 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, 
monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.09097266 0.06637001 0.06557441 0.06874728 0.0701592 0.07333708 0.06844783 0.07955551 0.06623769 0.07119703] mean value: 0.07205986976623535 key: score_time value: [0.01410508 0.0142591 0.01380372 0.0135386 0.01349068 0.01329613 0.01343465 0.0135026 0.01350117 0.01335311] mean value: 0.013628482818603516 key: test_mcc value: [0.85122361 0.81953867 0.92690611 0.87177979 0.91076511 0.8824419 0.95553309 0.89901011 0.93982555 0.9251987 ] mean value: 0.8982222631451187 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.92481203 0.90977444 0.96212121 0.93181818 0.95454545 0.93939394 0.97727273 0.9469697 0.96969697 0.96212121] mean value: 0.9478525860104807 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.92647059 0.91044776 0.96350365 0.93617021 0.95588235 0.94202899 0.97777778 0.94964029 0.97014925 0.96296296] mean value: 0.9495033832520609 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9 0.91044776 0.92957746 0.88 0.92857143 0.90277778 0.95652174 0.90410959 0.95588235 0.94202899] mean value: 0.9209917098951922 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.95454545 0.91044776 1. 1. 0.98484848 0.98484848 1. 1. 0.98484848 0.98484848] mean value: 0.9804387155133424 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.92503392 0.90976934 0.96212121 0.93181818 0.95454545 0.93939394 0.97727273 0.9469697 0.96969697 0.96212121] mean value: 0.9478742650384442 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8630137 0.83561644 0.92957746 0.88 0.91549296 0.89041096 0.95652174 0.90410959 0.94202899 0.92857143] mean value: 0.9045343260675828 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.89 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.31542921 0.28832293 0.30350065 0.30793715 0.28838658 0.33395696 0.19791198 0.19763827 0.20193744 0.20983505] mean value: 0.26448562145233157 key: score_time value: [0.02723169 0.02744985 0.02742767 0.0272634 0.02694702 0.02197075 0.01971865 0.02128482 0.02003503 0.02088046] mean value: 0.024020934104919435 key: test_mcc value: [0.88011764 0.84996625 0.82158384 0.89486432 0.8824419 0.86612538 0.92690611 0.91287093 0.95465504 0.88040627] mean value: 0.8869937674866786 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.93984962 0.92481203 0.90909091 0.9469697 0.93939394 0.93181818 0.96212121 0.95454545 0.97727273 0.93939394] mean value: 0.9425267714741399 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94029851 0.92647059 0.91304348 0.94814815 0.94202899 0.93430657 0.96350365 0.95652174 0.97709924 0.94117647] mean value: 0.9442597372952238 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.92647059 0.91304348 0.875 0.92753623 0.90277778 0.90140845 0.92957746 0.91666667 0.98461538 0.91428571] mean value: 0.9191381757218723 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.95454545 0.94029851 0.95454545 0.96969697 0.98484848 0.96969697 1. 1. 0.96969697 0.96969697] mean value: 0.971302578018996 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.93995929 0.92469471 0.90909091 0.9469697 0.93939394 0.93181818 0.96212121 0.95454545 0.97727273 0.93939394] mean value: 0.9425260063319765 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.88732394 0.8630137 0.84 0.90140845 0.89041096 0.87671233 0.92957746 0.91666667 0.95522388 0.88888889] mean value: 0.894922628160887 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.49 Accuracy on Blind test: 0.81 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01355791 0.01356125 0.01352477 0.01341796 0.01332808 0.01341581 0.01773357 0.01977444 0.02038264 0.01979041] mean value: 0.015848684310913085 key: score_time value: [0.00930381 0.00938654 0.00943065 0.00930262 0.00963712 0.00926065 0.01308894 0.01308584 0.01314425 0.0130682 ] mean value: 0.010870862007141113 key: test_mcc value: [0.82915052 0.74830832 0.7800135 0.85201287 0.85478752 0.78816781 0.82425939 0.82773811 0.78816781 0.81060226] mean value: 0.8103208098401626 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.90977444 0.87218045 0.88636364 0.92424242 0.92424242 0.88636364 0.90909091 0.90909091 0.88636364 0.90151515] mean value: 0.9009227614490772 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.91549296 0.87943262 0.89361702 0.92753623 0.92857143 0.89655172 0.91428571 0.91549296 0.89655172 0.90780142] mean value: 0.9075333802339808 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.85526316 0.83783784 0.84 0.88888889 0.87837838 0.82278481 0.86486486 0.85526316 0.82278481 0.85333333] mean value: 0.8519399239345942 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.92537313 0.95454545 0.96969697 0.98484848 0.98484848 0.96969697 0.98484848 0.98484848 0.96969697] mean value: 0.9713251922207147 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.91033469 0.87177748 0.88636364 0.92424242 0.92424242 0.88636364 0.90909091 0.90909091 0.88636364 0.90151515] mean value: 0.9009384893713251 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.84415584 0.78481013 0.80769231 0.86486486 0.86666667 0.8125 0.84210526 0.84415584 0.8125 0.83116883] mean value: 0.8310619748444532 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.38 Accuracy on Blind test: 0.76 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [4.82775021 4.74955702 3.40215826 3.2394557 3.25225616 3.24565363 3.2456162 3.36915302 3.59266257 3.57445192] mean value: 3.6498714685440063 key: score_time value: [0.14048791 0.14100409 0.10365963 0.10351729 0.1028018 0.11075878 0.11101532 0.11172509 0.10809517 0.1088829 ] mean value: 0.11419479846954346 key: test_mcc value: [0.94028503 0.85299767 0.93982555 0.92690611 0.94112395 0.85478752 0.94112395 0.94112395 0.95465504 0.89651574] mean value: 0.9189344490412299 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.96992481 0.92481203 0.96969697 0.96212121 0.96969697 0.92424242 0.96969697 0.96969697 0.97727273 0.9469697 ] mean value: 0.9584130781499203 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97014925 0.92857143 0.97014925 0.96350365 0.97058824 0.92857143 0.97058824 0.97058824 0.97744361 0.94890511] mean value: 0.959905843863454 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95588235 0.89041096 0.95588235 0.92957746 0.94285714 0.87837838 0.94285714 0.94285714 0.97014925 0.91549296] mean value: 0.9324345148002824 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.97014925 0.98484848 1. 1. 0.98484848 1. 1. 0.98484848 0.98484848] mean value: 0.9894391677973767 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.97003618 0.92446857 0.96969697 0.96212121 0.96969697 0.92424242 0.96969697 0.96969697 0.97727273 0.9469697 ] mean value: 0.95838986883763 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94202899 0.86666667 0.94202899 0.92957746 0.94285714 0.86666667 0.94285714 0.94285714 0.95588235 0.90277778] mean value: 0.9234200328426941 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.76 Accuracy on Blind test: 0.91 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, 
tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.3209312 1.44431853 1.58112407 1.51067591 1.5859437 2.19948959 1.38442206 1.93691373 1.28188467 1.27489114] mean value: 1.5520594596862793 key: score_time value: [0.15015864 0.19255185 0.17973113 0.18013835 0.17181015 0.19955444 0.27024174 0.23727822 0.20012379 0.24910641] mean value: 0.20306947231292724 key: test_mcc value: [0.89567796 0.80667588 0.89404202 0.92690611 0.94112395 0.83806027 0.89486432 0.90950859 0.93982555 0.88040627] mean value: 0.892709092074466 key: train_mcc value: [0.95517845 0.96007108 0.94492358 0.95674042 0.9502296 0.9484901 0.94492358 0.94686596 0.94860815 0.94524429] mean value: 0.9501275201585805 key: test_accuracy value: [0.94736842 0.90225564 0.9469697 0.96212121 0.96969697 0.91666667 0.9469697 0.95454545 0.96969697 0.93939394] mean value: 0.9455684666210982 key: train_accuracy value: [0.97729184 0.97981497 0.97226891 0.97815126 0.97478992 0.97394958 0.97226891 0.97310924 0.97394958 0.97226891] mean value: 0.9747863114968442 key: test_fscore value: [0.94814815 0.90647482 0.94736842 0.96350365 0.97058824 0.92086331 0.94814815 0.95522388 0.97014925 0.94117647] mean value: 0.9471644336691079 key: train_fscore value: [0.97770438 0.9800995 0.97265949 0.97847682 0.97524752 0.97440132 0.97265949 0.97359736 0.97444353 0.97279472] mean value: 0.9752084130865094 key: test_precision value: [0.92753623 0.875 0.94029851 0.92957746 0.94285714 0.87671233 0.92753623 0.94117647 0.95588235 0.91428571] mean value: 0.9230862445458927 key: train_precision value: [0.96103896 0.96568627 
0.95915033 0.96411093 0.95786062 0.95779221 0.95915033 0.95623987 0.95631068 0.95469256] mean value: 0.9592032749258542 key: test_recall value: [0.96969697 0.94029851 0.95454545 1. 1. 0.96969697 0.96969697 0.96969697 0.98484848 0.96969697] mean value: 0.9728177295341475 key: train_recall value: [0.99495798 0.99494949 0.98655462 0.99327731 0.99327731 0.99159664 0.98655462 0.99159664 0.99327731 0.99159664] mean value: 0.9917638570579748 key: test_roc_auc value: [0.94753505 0.90196744 0.9469697 0.96212121 0.96969697 0.91666667 0.9469697 0.95454545 0.96969697 0.93939394] mean value: 0.9455563093622795 key: train_roc_auc value: [0.97727697 0.97982769 0.97226891 0.97815126 0.97478992 0.97394958 0.97226891 0.97310924 0.97394958 0.97226891] mean value: 0.9747860962566846 key: test_jcc value: [0.90140845 0.82894737 0.9 0.92957746 0.94285714 0.85333333 0.90140845 0.91428571 0.94202899 0.88888889] mean value: 0.9002735799490561 key: train_jcc value: [0.95638126 0.96097561 0.94677419 0.95786062 0.95169082 0.95008052 0.94677419 0.94855305 0.95016077 0.9470305 ] mean value: 0.9516281533345908 MCC on Blind test: 0.78 Accuracy on Blind test: 0.91 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.06151128 0.05858254 0.06312013 0.07686377 0.06964469 0.07349968 0.05664515 0.03685856 0.06115079 0.06200194] mean value: 0.06198785305023193 key: score_time value: [0.03485608 0.03934264 0.03880668 0.03443718 0.03964758 0.03490686 0.02829528 0.05180097 0.02917194 0.03911948] mean value: 0.0370384693145752 key: test_mcc value: [0.5339213 0.53558614 0.44562679 0.57575758 0.67161876 0.60717674 0.57602211 0.60633906 0.63753558 0.59097693] mean value: 0.5780560993827096 key: train_mcc value: [0.62994548 0.60817175 0.60363763 0.60684612 0.62694743 0.62016807 0.6103634 0.60178352 0.6067364 0.58220454] mean value: 0.6096804333287658 key: test_accuracy value: [0.76691729 0.76691729 0.71969697 0.78787879 0.83333333 0.8030303 0.78787879 0.8030303 0.81818182 0.79545455] mean value: 0.7882319434951014 key: train_accuracy value: [0.81497056 0.80403701 0.80168067 0.80336134 0.81344538 0.81008403 0.80504202 0.80084034 0.80336134 0.7907563 ] mean value: 0.8047578997957467 key: test_fscore value: [0.76691729 0.75968992 0.69421488 0.78787879 0.84285714 0.796875 0.78461538 0.8 0.82352941 0.79389313] mean value: 0.7850470948633774 key: train_fscore value: [0.81481481 0.80203908 0.79863481 0.80135823 0.81469115 0.81008403 0.80204778 0.79898219 0.80269815 0.78552972] mean value: 0.8030879959995846 key: test_precision value: [0.76119403 0.79032258 0.76363636 0.78787879 0.7972973 0.82258065 0.796875 0.8125 0.8 0.8 ] mean value: 0.7932284704469647 key: train_precision value: [0.81618887 0.80960549 0.81109185 0.80960549 0.8092869 0.81008403 0.81455806 0.80650685 0.80541455 0.80565371] mean value: 
0.8097995804820648 key: test_recall value: [0.77272727 0.73134328 0.63636364 0.78787879 0.89393939 0.77272727 0.77272727 0.78787879 0.84848485 0.78787879] mean value: 0.779194934418815 key: train_recall value: [0.81344538 0.79461279 0.78655462 0.79327731 0.82016807 0.81008403 0.78991597 0.79159664 0.8 0.76638655] mean value: 0.7966041366041366 key: test_roc_auc value: [0.76696065 0.76718679 0.71969697 0.78787879 0.83333333 0.8030303 0.78787879 0.8030303 0.81818182 0.79545455] mean value: 0.7882632293080054 key: train_roc_auc value: [0.81497185 0.80402909 0.80168067 0.80336134 0.81344538 0.81008403 0.80504202 0.80084034 0.80336134 0.7907563 ] mean value: 0.8047572362278245 key: test_jcc value: [0.62195122 0.6125 0.53164557 0.65 0.72839506 0.66233766 0.64556962 0.66666667 0.7 0.65822785] mean value: 0.6477293648219603 key: train_jcc value: [0.6875 0.66950355 0.66477273 0.66855524 0.68732394 0.68079096 0.66951567 0.66525424 0.67042254 0.64680851] mean value: 0.671044737093254 MCC on Blind test: 0.52 Accuracy on Blind test: 0.81 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), 
('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [9.41463685 8.58698535 8.8384304 9.8259964 7.32652545 2.57784414 5.06993341 9.8626914 8.55897713 8.41459346] mean value: 7.847661399841309 key: score_time value: [0.02014756 0.02379441 0.02771115 0.01952624 0.01420593 0.01656413 0.02535224 0.03231311 0.02562571 0.01690316] mean value: 0.022214365005493165 key: test_mcc value: [0.91355192 0.86703475 0.94112395 0.91287093 0.94112395 0.85478752 0.9701425 0.91287093 0.93982555 0.91076511] mean value: 0.9164097097925172 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95488722 0.93233083 0.96969697 0.95454545 0.96969697 0.92424242 0.98484848 0.95454545 0.96969697 0.95454545] mean value: 0.9569036226930964 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.95652174 0.9352518 0.97058824 0.95652174 0.97058824 0.92857143 0.98507463 0.95652174 0.97014925 0.95588235] mean value: 0.9585671148650311 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.91666667 0.90277778 0.94285714 0.91666667 0.94285714 0.87837838 0.97058824 0.91666667 0.95588235 0.92857143] mean value: 0.9271912458677165 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.97014925 1. 1. 1. 0.98484848 1. 1. 0.98484848 0.98484848] mean value: 0.9924694708276798 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.95522388 0.93204432 0.96969697 0.95454545 0.96969697 0.92424242 0.98484848 0.95454545 0.96969697 0.95454545] mean value: 0.9569086386250565 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.91666667 0.87837838 0.94285714 0.91666667 0.94285714 0.86666667 0.97058824 0.91666667 0.94202899 0.91549296] mean value: 0.9208869509307174 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.76 Accuracy on Blind test: 0.91 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', 
MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.14503193 0.15870619 0.1729548 0.17024684 0.18737078 0.19274426 0.13104177 0.13881731 0.1534431 0.18702292] mean value: 0.16373798847198487 key: score_time value: [0.03879404 0.02698874 0.03794503 0.02734399 0.03862238 0.02691483 0.02693462 0.02665114 0.01282382 0.02664685] mean value: 0.028966546058654785 key: test_mcc value: [0.68430574 0.65422886 0.66943868 0.76642417 0.77521709 0.68378319 0.74250948 0.69986771 0.7431924 0.67161876] mean value: 0.7090586089861183 key: train_mcc value: [0.80866657 0.81346009 0.8086448 0.79321598 0.79555584 0.79465334 0.77715689 0.78250788 0.78328462 0.80270163] mean value: 0.795984763671047 key: test_accuracy value: 
[0.84210526 0.82706767 0.83333333 0.87878788 0.87878788 0.84090909 0.87121212 0.84848485 0.87121212 0.83333333] mean value: 0.8525233538391433 key: train_accuracy value: [0.90328007 0.9058032 0.90336134 0.89579832 0.89663866 0.89663866 0.88823529 0.8907563 0.8907563 0.90084034] mean value: 0.8972108473330459 key: test_fscore value: [0.84210526 0.82706767 0.84057971 0.88732394 0.89041096 0.84671533 0.87218045 0.85507246 0.87407407 0.84285714] mean value: 0.8578387005336142 key: train_fscore value: [0.90673155 0.90879479 0.90658002 0.8990228 0.90040486 0.89959184 0.89053498 0.89344262 0.89430894 0.90327869] mean value: 0.9002691083913814 key: test_precision value: [0.8358209 0.83333333 0.80555556 0.82894737 0.8125 0.81690141 0.86567164 0.81944444 0.85507246 0.7972973 ] mean value: 0.8270544408583936 key: train_precision value: [0.87617555 0.88012618 0.87735849 0.87203791 0.86875 0.87460317 0.87258065 0.872 0.86614173 0.8816 ] mean value: 0.8741373688860552 key: test_recall value: [0.84848485 0.82089552 0.87878788 0.95454545 0.98484848 0.87878788 0.87878788 0.89393939 0.89393939 0.89393939] mean value: 0.8926956128448665 key: train_recall value: [0.9394958 0.93939394 0.93781513 0.92773109 0.93445378 0.92605042 0.9092437 0.91596639 0.92436975 0.92605042] mean value: 0.9280570409982175 key: test_roc_auc value: [0.84215287 0.82711443 0.83333333 0.87878788 0.87878788 0.84090909 0.87121212 0.84848485 0.87121212 0.83333333] mean value: 0.8525327905924921 key: train_roc_auc value: [0.90324958 0.90583142 0.90336134 0.89579832 0.89663866 0.89663866 0.88823529 0.8907563 0.8907563 0.90084034] mean value: 0.8972106216223864 key: test_jcc value: [0.72727273 0.70512821 0.725 0.79746835 0.80246914 0.73417722 0.77333333 0.74683544 0.77631579 0.72839506] mean value: 0.7516395265397042 key: train_jcc value: [0.82937685 0.83283582 0.82912333 0.81656805 0.81885125 0.81750742 0.80267062 0.80740741 0.80882353 0.82361734] mean value: 0.8186781620728142 MCC on Blind test: 0.63 Accuracy on 
Blind test: 0.85 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', 
RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02964282 0.03476715 0.04506683 0.04617333 0.05880022 0.04431558 0.04388618 0.04419994 0.04364896 0.04388213] mean value: 0.04343831539154053 key: score_time value: [0.01396203 0.0264771 0.0231936 0.02345204 0.02391791 0.02365637 0.02127576 0.02268791 0.02417183 0.02144814] mean value: 0.02242426872253418 key: test_mcc value: [0.56460751 0.55326567 0.4598545 0.50144101 0.62128344 0.56222174 0.54645907 0.57815159 0.68252363 0.59097693] mean value: 0.5660785084147357 key: train_mcc value: [0.56986488 0.58646496 0.57833507 0.55833105 0.57858369 0.58860201 0.55339154 0.57663023 0.5750832 0.55191644] mean value: 0.5717203080182092 key: test_accuracy value: [0.78195489 0.77443609 0.72727273 0.75 0.81060606 0.78030303 0.77272727 0.78787879 0.84090909 0.79545455] mean value: 0.7821542492595124 key: train_accuracy value: [0.78469302 0.79310345 0.78907563 0.7789916 0.78907563 0.79411765 0.77647059 0.78823529 0.78739496 0.77563025] mean value: 0.7856788064258504 key: test_fscore value: [0.78518519 0.76190476 0.70491803 0.74015748 0.81203008 0.77165354 0.77941176 0.77777778 0.84444444 0.79389313] mean value: 0.7771376195385946 key: train_fscore value: [0.78044597 0.78974359 0.78638298 
0.77502139 0.78491859 0.79041916 0.77186964 0.78571429 0.78394535 0.77002584] mean value: 0.7818486790915893 key: test_precision value: [0.76811594 0.81355932 0.76785714 0.7704918 0.80597015 0.80327869 0.75714286 0.81666667 0.82608696 0.8 ] mean value: 0.79291695283083 key: train_precision value: [0.79684764 0.80208333 0.79655172 0.78919861 0.8006993 0.80487805 0.78809107 0.79518072 0.796875 0.78975265] mean value: 0.7960158090319096 key: test_recall value: [0.8030303 0.71641791 0.65151515 0.71212121 0.81818182 0.74242424 0.8030303 0.74242424 0.86363636 0.78787879] mean value: 0.7640660334690186 key: train_recall value: [0.76470588 0.77777778 0.77647059 0.76134454 0.7697479 0.77647059 0.75630252 0.77647059 0.77142857 0.7512605 ] mean value: 0.7681979458450047 key: test_roc_auc value: [0.78211217 0.77487562 0.72727273 0.75 0.81060606 0.78030303 0.77272727 0.78787879 0.84090909 0.79545455] mean value: 0.7822139303482587 key: train_roc_auc value: [0.78470984 0.79309057 0.78907563 0.7789916 0.78907563 0.79411765 0.77647059 0.78823529 0.78739496 0.77563025] mean value: 0.7856792009733187 key: test_jcc value: [0.64634146 0.61538462 0.5443038 0.5875 0.6835443 0.62820513 0.63855422 0.63636364 0.73076923 0.65822785] mean value: 0.6369194240371803 key: train_jcc value: [0.63994374 0.65254237 0.64796634 0.63268156 0.64598025 0.65346535 0.62849162 0.64705882 0.64466292 0.62605042] mean value: 0.6418843403318552 MCC on Blind test: 0.53 Accuracy on Blind test: 0.81 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', 
ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.03986382 0.07160902 0.06057954 0.05892277 0.07447052 0.06678772 0.06090355 0.05563188 0.0627687 0.07016468] mean value: 0.062170219421386716 key: score_time value: [0.02108192 0.02115917 0.02505922 0.02454782 0.02111268 0.02110505 0.02623105 0.03131795 0.02457404 0.02104902] mean value: 0.023723793029785157 key: test_mcc value: [0.56157905 0.537956 0.61702991 0.60234703 0.69293487 0.64379631 0.66776115 0.59133807 0.67426617 0.7800135 ] mean value: 0.6369022051810874 key: train_mcc value: [0.64621096 0.70065565 0.70049556 0.66869353 0.75909618 0.75437781 0.68454294 0.66327631 0.66120137 0.74719859] mean value: 0.6985748896957277 key: test_accuracy value: [0.76691729 0.7593985 0.79545455 0.77272727 0.83333333 0.81818182 0.81818182 0.78030303 0.82575758 0.88636364] mean value: 0.8056618819776714 key: train_accuracy value: [0.80824222 0.83347351 0.84453782 0.81428571 0.87815126 0.87478992 0.82605042 0.81764706 0.82184874 0.86890756] mean value: 0.838793421489706 key: test_fscore value: [0.72072072 0.78947368 0.76106195 0.8125 0.85333333 0.83098592 0.84210526 0.73873874 0.8 0.89361702] mean value: 0.8042536623833423 key: train_fscore value: [0.77470356 0.85547445 0.82917821 0.84134961 0.88315874 0.8814638 0.84901532 0.78704612 0.79886148 0.87850467] mean value: 0.8378755963279767 key: test_precision value: [0.88888889 0.70588235 0.91489362 0.69148936 0.76190476 0.77631579 0.74418605 0.91111111 0.93877551 0.84 ] mean value: 0.8173447439758736 key: train_precision value: [0.94004796 0.75515464 0.92008197 0.73433584 0.84829721 
0.83685801 0.75 0.94575472 0.91721133 0.81857765] mean value: 0.8466319322006147 key: test_recall value: [0.60606061 0.89552239 0.65151515 0.98484848 0.96969697 0.89393939 0.96969697 0.62121212 0.6969697 0.95454545] mean value: 0.824400723654455 key: train_recall value: [0.65882353 0.98653199 0.75462185 0.98487395 0.9210084 0.93109244 0.97815126 0.67394958 0.70756303 0.94789916] mean value: 0.8544515179809298 key: test_roc_auc value: [0.76571687 0.75836725 0.79545455 0.77272727 0.83333333 0.81818182 0.81818182 0.78030303 0.82575758 0.88636364] mean value: 0.8054387155133425 key: train_roc_auc value: [0.80836799 0.83360213 0.84453782 0.81428571 0.87815126 0.87478992 0.82605042 0.81764706 0.82184874 0.86890756] mean value: 0.8388188608776844 key: test_jcc value: [0.56338028 0.65217391 0.61428571 0.68421053 0.74418605 0.71084337 0.72727273 0.58571429 0.66666667 0.80769231] mean value: 0.6756425842686714 key: train_jcc value: [0.63225806 0.74744898 0.70820189 0.72614622 0.79076479 0.78805121 0.73764259 0.64886731 0.66508689 0.78333333] mean value: 0.7227801277927314 MCC on Blind test: 0.27 Accuracy on Blind test: 0.76 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), 
('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.05123115 0.0788691 0.09172297 0.07682514 0.06595588 0.0708189 0.06753492 0.07790565 0.0734241 0.08075595] mean value: 0.07350437641143799 key: score_time value: [0.02121162 0.02126455 0.021348 0.02149343 0.0214448 0.02145123 0.02145767 0.02133083 0.02132106 0.02133274] mean value: 0.021365594863891602 key: test_mcc value: [0.69085775 0.65190379 0.55436714 0.72760688 0.71417356 0.44226898 0.75792383 0.57835174 0.45716359 0.75725927] mean value: 0.6331876526299302 key: train_mcc value: [0.77641553 0.75718089 0.55341157 0.76342639 0.75572242 0.57306029 0.77079477 0.54488981 0.50295754 0.72800226] mean value: 0.6725861465309467 key: test_accuracy value: [0.84210526 0.81203008 0.74242424 0.86363636 0.85606061 0.6969697 0.87878788 0.75757576 0.68181818 0.87121212] mean value: 0.8002620186830713 key: train_accuracy value: [0.88477712 0.87384357 0.74537815 0.87983193 0.87647059 0.75630252 0.88487395 0.73865546 0.71092437 0.85378151] mean value: 0.820483917705013 key: test_fscore value: [0.85106383 0.7826087 0.66 0.86153846 0.86131387 0.60784314 0.88059701 0.68627451 0.54347826 0.88275862] mean value: 0.7617476399134425 key: train_fscore value: [0.89204098 0.86288848 0.66885246 0.87356322 0.87093942 0.68614719 0.88779689 0.65559247 0.60277136 0.86917293] mean value: 0.78697653961389 key: test_precision value: [0.8 0.9375 0.97058824 0.875 0.83098592 0.86111111 0.86764706 0.97222222 0.96153846 0.81012658] mean value: 0.8886719586760881 key: train_precision value: [0.83976261 0.944 0.95625 0.92164179 0.91176471 0.96352584 0.8658147 0.96103896 0.96309963 
0.78639456] mean value: 0.9113292790413379 key: test_recall value: [0.90909091 0.67164179 0.5 0.84848485 0.89393939 0.46969697 0.89393939 0.53030303 0.37878788 0.96969697] mean value: 0.706558118498417 key: train_recall value: [0.9512605 0.79461279 0.51428571 0.8302521 0.83361345 0.53277311 0.91092437 0.49747899 0.43865546 0.97142857] mean value: 0.7275285063520358 key: test_roc_auc value: [0.84260516 0.81309362 0.74242424 0.86363636 0.85606061 0.6969697 0.87878788 0.75757576 0.68181818 0.87121212] mean value: 0.8004183627317956 key: train_roc_auc value: [0.88472116 0.87377699 0.74537815 0.87983193 0.87647059 0.75630252 0.88487395 0.73865546 0.71092437 0.85378151] mean value: 0.8204716634128398 key: test_jcc value: [0.74074074 0.64285714 0.49253731 0.75675676 0.75641026 0.43661972 0.78666667 0.52238806 0.37313433 0.79012346] mean value: 0.6298234440024083 key: train_jcc value: [0.80512091 0.75884244 0.50246305 0.7755102 0.77138414 0.52224053 0.7982327 0.48764415 0.43140496 0.76861702] mean value: 0.6621460103083406 MCC on Blind test: 0.7 Accuracy on Blind test: 0.88 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.46430111 0.63151526 0.62315559 0.49676085 0.48500848 0.48636103 0.56049109 0.52820277 0.47167945 0.49920678] mean value: 0.5246682405471802 key: score_time value: [0.02340293 0.02379346 0.040169 0.02477193 0.02459431 0.02452683 0.02468944 0.023803 0.02333736 0.02350903] mean value: 0.025659728050231933 key: test_mcc value: [0.89567796 0.80525437 0.84848485 0.86853519 0.89901011 0.82158384 0.89404202 0.77352678 0.9251987 0.833429 ] mean value: 0.8564742825088401 key: train_mcc value: [0.91641203 0.94997419 0.91334836 0.92034737 0.90140194 0.92680468 0.91322951 0.90448915 0.90191224 0.92034737] mean value: 0.916826682329092 key: test_accuracy value: [0.94736842 0.90225564 0.92424242 0.93181818 0.9469697 0.90909091 0.9469697 0.88636364 0.96212121 0.91666667] mean value: 0.92738664843928 key: train_accuracy value: [0.95794786 0.97476871 0.95630252 0.95966387 0.95042017 0.96302521 0.95630252 0.95210084 0.95042017 0.95966387] mean value: 0.9580615728208861 key: test_fscore value: [0.94814815 0.90510949 0.92424242 0.9352518 0.94964029 0.91304348 0.94736842 0.88888889 0.96296296 0.91729323] mean value: 0.9291949132020663 key: train_fscore value: [0.95867769 0.97512438 0.95716639 0.96059113 0.95127993 0.96375618 0.95709571 0.9526971 0.95159967 0.96059113] mean value: 0.958857931089391 key: test_precision value: [0.92753623 0.88571429 0.92424242 0.89041096 0.90410959 0.875 0.94029851 0.86956522 0.94202899 0.91044776] mean value: 0.906935396134124 key: train_precision value: [0.94308943 0.96078431 0.93861066 0.93900482 0.93506494 0.9450727 0.94003241 
0.94098361 0.92948718 0.93900482] mean value: 0.941113487171725 key: test_recall value: [0.96969697 0.92537313 0.92424242 0.98484848 1. 0.95454545 0.95454545 0.90909091 0.98484848 0.92424242] mean value: 0.9531433740388964 key: train_recall value: [0.97478992 0.98989899 0.97647059 0.98319328 0.96806723 0.98319328 0.97478992 0.96470588 0.97478992 0.98319328] mean value: 0.9773092267209914 key: test_roc_auc value: [0.94753505 0.90208051 0.92424242 0.93181818 0.9469697 0.90909091 0.9469697 0.88636364 0.96212121 0.91666667] mean value: 0.9273857982813207 key: train_roc_auc value: [0.95793368 0.97478143 0.95630252 0.95966387 0.95042017 0.96302521 0.95630252 0.95210084 0.95042017 0.95966387] mean value: 0.9580614265908384 key: test_jcc value: [0.90140845 0.82666667 0.85915493 0.87837838 0.90410959 0.84 0.9 0.8 0.92857143 0.84722222] mean value: 0.8685511665161482 key: train_jcc value: [0.92063492 0.95145631 0.9178515 0.92417062 0.90708661 0.93004769 0.91772152 0.90966719 0.90766823 0.92417062] mean value: 0.9210475218786636 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. 
warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.26204967 0.27810788 0.28340149 0.26984215 0.28152084 0.25897765 0.30171061 0.28025818 0.28854394 0.30127692] mean value: 0.28056893348693845 key: score_time value: [0.02732182 0.02710104 0.03132987 0.02990603 0.03089166 0.0326097 0.03289962 0.02989483 0.0305655 0.03311324] mean value: 0.03056333065032959 key: test_mcc value: [0.92577528 0.74631701 0.91287093 0.91287093 0.89651574 0.86853519 0.93939394 0.95553309 0.93982555 0.8824419 ] mean value: 0.8980079544979622 key: train_mcc value: [0.9932941 0.99495514 0.99497063 0.99497063 0.99832074 0.99495939 0.99663866 0.99663866 0.99663866 0.99664429] mean value: 0.995803087652558 key: test_accuracy value: [0.96240602 0.87218045 0.95454545 0.95454545 0.9469697 0.93181818 0.96969697 0.97727273 0.96969697 0.93939394] mean value: 0.9478525860104807 key: train_accuracy value: [0.99663583 0.99747687 0.99747899 0.99747899 0.99915966 0.99747899 0.99831933 0.99831933 0.99831933 0.99831933] mean value: 0.9978986649327519 key: test_fscore value: [0.96296296 0.87769784 0.95652174 0.95652174 0.94890511 0.9352518 0.96969697 0.97777778 0.97014925 0.94202899] mean value: 0.949751417771399 key: train_fscore value: [0.99664992 0.99747262 0.99748533 0.99748533 0.99916037 0.99748111 0.99831933 0.99831933 0.99831933 0.99832215] mean value: 0.9979014807088672 key: test_precision value: [0.94202899 0.84722222 0.91666667 0.91666667 0.91549296 0.89041096 0.96969697 0.95652174 0.95588235 0.90277778] mean value: 0.9213367297259749 key: train_precision value: [0.9933222 0.99831366 0.99498328 0.99498328 
0.99832215 0.9966443 0.99831933 0.99831933 0.99831933 0.99664992] mean value: 0.9968176760610129 key: test_recall value: [0.98484848 0.91044776 1. 1. 0.98484848 0.98484848 0.96969697 1. 0.98484848 0.98484848] mean value: 0.9804387155133424 key: train_recall value: [1. 0.996633 1. 1. 1. 0.99831933 0.99831933 0.99831933 0.99831933 1. ] mean value: 0.9989910307557366 key: test_roc_auc value: [0.9625735 0.87189055 0.95454545 0.95454545 0.9469697 0.93181818 0.96969697 0.97727273 0.96969697 0.93939394] mean value: 0.9478403437358661 key: train_roc_auc value: [0.996633 0.99747616 0.99747899 0.99747899 0.99915966 0.99747899 0.99831933 0.99831933 0.99831933 0.99831933] mean value: 0.9978983108394873 key: test_jcc value: [0.92857143 0.78205128 0.91666667 0.91666667 0.90277778 0.87837838 0.94117647 0.95652174 0.94202899 0.89041096] mean value: 0.9055250354242226 key: train_jcc value: [0.9933222 0.99495798 0.99498328 0.99498328 0.99832215 0.99497487 0.9966443 0.9966443 0.9966443 0.99664992] mean value: 0.9958126566226824 MCC on Blind test: 0.79 Accuracy on Blind test: 0.92 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [1.2047646 1.20868587 1.12930322 1.05966353 1.07176399 1.15874434 1.15021348 1.16694856 1.16003871 1.15227175] mean value: 1.1462398052215577 key: score_time value: [0.0769465 0.0892241 0.07288265 0.08414268 0.07057261 0.07313824 0.07241249 0.07278395 0.07415533 0.07414198] mean value: 0.07604005336761474 key: test_mcc value: [0.80689237 0.77857748 0.69986771 0.83573501 0.83806027 0.72222273 0.85478752 0.85478752 0.84119102 0.78368849] mean value: 0.8015810125001238 key: train_mcc value: [0.95661128 0.96170619 0.95501173 0.95684323 0.96194386 0.96487079 0.96173716 0.95684323 0.96509988 0.95858042] mean value: 0.9599247751275188 key: test_accuracy value: [0.90225564 0.88721805 0.84848485 0.91666667 0.91666667 0.85606061 0.92424242 0.92424242 0.91666667 0.88636364] mean value: 0.8978867623604465 key: train_accuracy value: [0.97813288 0.98065601 0.97731092 0.97815126 0.98067227 0.98235294 0.98067227 0.97815126 0.98235294 0.9789916 ] mean value: 0.9797444360418683 key: test_fscore value: [0.90510949 0.89361702 0.85507246 0.91970803 0.92086331 0.86713287 0.92857143 0.92857143 0.92198582 0.8951049 ] mean value: 0.9035736747628861 key: train_fscore value: [0.97844113 0.98091286 0.97763049 0.9785124 0.98100743 0.98251457 0.98094449 0.9785124 0.98260149 0.9793559 ] mean value: 0.9800433162018617 key: test_precision value: [0.87323944 0.85135135 0.81944444 0.88732394 0.87671233 0.80519481 0.87837838 0.87837838 0.86666667 0.83116883] mean value: 0.856785856463167 key: train_precision value: [0.96563011 0.96726678 0.96405229 0.96260163 0.96428571 
0.97359736 0.96732026 0.96260163 0.96895425 0.96266234] mean value: 0.9658972351445866 key: test_recall value: [0.93939394 0.94029851 0.89393939 0.95454545 0.96969697 0.93939394 0.98484848 0.98484848 0.98484848 0.96969697] mean value: 0.9561510628674807 key: train_recall value: [0.99159664 0.99494949 0.99159664 0.99495798 0.99831933 0.99159664 0.99495798 0.99495798 0.99663866 0.99663866] mean value: 0.9946209999151175 key: test_roc_auc value: [0.90253279 0.88681592 0.84848485 0.91666667 0.91666667 0.85606061 0.92424242 0.92424242 0.91666667 0.88636364] mean value: 0.8978742650384441 key: train_roc_auc value: [0.97812155 0.98066802 0.97731092 0.97815126 0.98067227 0.98235294 0.98067227 0.97815126 0.98235294 0.9789916 ] mean value: 0.979744503862151 key: test_jcc value: [0.82666667 0.80769231 0.74683544 0.85135135 0.85333333 0.7654321 0.86666667 0.86666667 0.85526316 0.81012658] mean value: 0.8250034274353617 key: train_jcc value: [0.95779221 0.96254072 0.95623987 0.9579288 0.96272285 0.96563011 0.96260163 0.9579288 0.96579805 0.95954693] mean value: 0.9608729964186585 MCC on Blind test: 0.59 Accuracy on Blind test: 0.84 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), 
('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [2.13605762 2.16598344 2.45498848 2.29848862 2.23179269 2.26086473 2.53592467 2.4032464 1.9804318 1.53801203] mean value: 2.200579047203064 key: score_time value: [0.01504397 0.01493168 0.01365137 0.02563953 0.01351357 0.01368713 0.0136621 0.01364279 0.0099721 0.01013899] mean value: 0.014388322830200195 key: test_mcc value: [0.88134139 0.85299767 0.92690611 0.89901011 0.92690611 0.88531564 0.9701425 0.91287093 0.95553309 0.8824419 ] mean value: 0.909346543788278 key: train_mcc value: [0.96662567 0.97665016 0.97666924 0.97009871 0.968375 0.9684626 0.97337873 0.97337873 0.97173741 0.97666924] mean value: 0.9722045482625817 key: test_accuracy value: [0.93984962 0.92481203 0.96212121 0.9469697 0.96212121 0.93939394 0.98484848 0.95454545 0.97727273 0.93939394] mean value: 0.9531328320802005 key: train_accuracy value: [0.98317914 0.9882254 0.98823529 0.98487395 0.98403361 0.98403361 0.98655462 0.98655462 0.98571429 0.98823529] mean value: 0.985963983574927 key: test_fscore value: [0.94117647 0.92857143 0.96350365 0.94964029 0.96350365 0.94285714 0.98507463 0.95652174 0.97777778 0.94202899] mean value: 0.9550655758337795 key: train_fscore value: [0.9833887 0.98833333 0.98835275 0.98507463 0.98423237 0.98425849 0.98671096 0.98671096 0.98589212 0.98835275] mean value: 0.9861307055733873 key: test_precision value: [0.91428571 0.89041096 0.92957746 0.90410959 0.92957746 0.89189189 0.97058824 0.91666667 0.95652174 0.90277778] mean value: 0.9206407502569274 key: train_precision value: [0.97208539 0.97854785 0.9785832 0.97217676 0.97213115 0.97058824 
0.97536946 0.97536946 0.97377049 0.9785832 ] mean value: 0.9747205183061565 key: test_recall value: [0.96969697 0.97014925 1. 1. 1. 1. 1. 1. 1. 0.98484848] mean value: 0.9924694708276798 key: train_recall value: [0.99495798 0.9983165 0.99831933 0.99831933 0.99663866 0.99831933 0.99831933 0.99831933 0.99831933 0.99831933] mean value: 0.9978148431089607 key: test_roc_auc value: [0.94007237 0.92446857 0.96212121 0.9469697 0.96212121 0.93939394 0.98484848 0.95454545 0.97727273 0.93939394] mean value: 0.9531207598371777 key: train_roc_auc value: [0.98316923 0.98823388 0.98823529 0.98487395 0.98403361 0.98403361 0.98655462 0.98655462 0.98571429 0.98823529] mean value: 0.9859638400814871 key: test_jcc value: [0.88888889 0.86666667 0.92957746 0.90410959 0.92957746 0.89189189 0.97058824 0.91666667 0.95652174 0.89041096] mean value: 0.9144899566061336 key: train_jcc value: [0.96732026 0.97693575 0.97697368 0.97058824 0.96895425 0.96900489 0.97377049 0.97377049 0.97217676 0.97697368] mean value: 0.9726468500088701 MCC on Blind test: 0.78 Accuracy on Blind test: 0.91 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables 
are collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.04838514 0.05437803 0.05212498 0.05181313 0.09965992 0.10766077 0.0892477 0.06327462 0.06029487 0.05176997] mean value: 0.06786091327667236 key: score_time value: [0.01332021 0.01374054 0.01412916 0.01401949 0.02348638 0.0190959 0.0138495 0.01379108 0.01380205 0.02299404] mean value: 0.016222834587097168 key: test_mcc value: [0.31255936 0.27144125 0.1767767 0.28629917 0.15027827 0.23664319 0.28629917 0.21821789 0.21038958 0.19682713] mean value: 0.2345731703657738 key: train_mcc value: [0.25968885 0.25367309 0.2611946 0.25377296 0.26846242 0.29285905 0.25750387 0.27025687 0.25935415 0.25564355] mean value: 0.26324094119122415 key: test_accuracy value: [0.58646617 0.57142857 0.53030303 0.57575758 0.53787879 0.5530303 0.57575758 0.54545455 0.5530303 0.56060606] mean value: 0.5589712918660287 key: train_accuracy value: [0.56349874 0.56013457 0.56386555 0.5605042 0.56722689 0.5789916 0.56218487 0.56806723 0.56302521 0.56134454] mean value: 0.5648843389332183 key: test_fscore value: [0.70588235 0.70157068 0.68041237 0.70212766 0.67724868 0.69109948 0.70212766 0.6875 0.68783069 0.68478261] mean value: 0.6920582174067214 key: train_fscore value: [0.69631363 0.6943308 0.69631363 0.69468768 0.69794721 0.70372561 0.69549971 0.69835681 0.69590643 0.69509346] mean value: 0.6968174976741559 key: test_precision value: [0.54545455 0.54032258 0.515625 0.54098361 0.5203252 0.528 0.54098361 0.52380952 0.52845528 0.53389831] mean value: 0.5317857655913608 key: train_precision value: [0.53411131 0.53178156 0.53411131 0.53220036 0.53603604 0.54288321 0.53315412 0.53651939 
0.53363229 0.53267681] mean value: 0.5347106393011474 key: test_recall value: [1. 1. 1. 1. 0.96969697 1. 1. 1. 0.98484848 0.95454545] mean value: 0.990909090909091 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.58955224 0.56818182 0.53030303 0.57575758 0.53787879 0.5530303 0.57575758 0.54545455 0.5530303 0.56060606] mean value: 0.558955223880597 key: train_roc_auc value: [0.56313131 0.5605042 0.56386555 0.5605042 0.56722689 0.5789916 0.56218487 0.56806723 0.56302521 0.56134454] mean value: 0.5648845598845599 key: test_jcc value: [0.54545455 0.54032258 0.515625 0.54098361 0.512 0.528 0.54098361 0.52380952 0.52419355 0.52066116] mean value: 0.5292033568435874 key: train_jcc value: [0.53411131 0.53178156 0.53411131 0.53220036 0.53603604 0.54288321 0.53315412 0.53651939 0.53363229 0.53267681] mean value: 0.5347106393011474 MCC on Blind test: 0.11 Accuracy on Blind test: 0.38 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.06216955 0.03932023 0.06577659 0.04073477 0.04673862 0.02940083 0.03316712 0.05854964 0.04999566 0.0483737 ] mean value: 0.0474226713180542 key: score_time value: [0.02970314 0.0334301 0.03142047 0.02710342 0.0294013 0.01831245 0.020015 0.02023387 0.01992059 0.02014923] mean value: 0.024968957901000975 key: test_mcc value: [0.6695318 0.62406697 0.69986771 0.80758535 0.76237471 0.69825325 0.77495429 0.75792383 0.81818182 0.73029674] mean value: 0.7343036471383053 key: train_mcc value: [0.77850192 0.79341585 0.77335768 0.77317772 0.7797431 0.77672743 0.77477537 0.77870084 0.76803819 0.77766758] mean value: 0.7774105692657516 key: test_accuracy value: [0.83458647 0.81203008 0.84848485 0.90151515 0.87121212 0.84848485 0.88636364 0.87878788 0.90909091 0.86363636] mean value: 0.8654192298929141 key: train_accuracy value: [0.8881413 0.89571068 0.88571429 0.88571429 0.88907563 0.88739496 0.88655462 0.88823529 0.88319328 0.88823529] mean value: 0.8877969623509623 key: test_fscore value: [0.8358209 0.81481481 0.85507246 0.90647482 0.88435374 0.85294118 0.89051095 0.88059701 0.90909091 0.86956522] mean value: 0.8699242002529087 key: train_fscore value: [0.89230769 0.89918699 0.88961039 0.88943089 0.89250814 0.89123377 0.8901546 0.89230769 0.88689992 0.89125102] mean value: 0.8914891107904296 key: test_precision value: [0.82352941 0.80882353 0.81944444 0.8630137 0.80246914 0.82857143 0.85915493 0.86764706 0.90909091 0.83333333] mean value: 0.8415077879450187 key: train_precision value: [0.8609375 0.86949686 0.86028257 0.86141732 0.8657188 0.86185243 0.86277603 
0.8609375 0.85962145 0.86783439] mean value: 0.8630874856643093 key: test_recall value: [0.84848485 0.82089552 0.89393939 0.95454545 0.98484848 0.87878788 0.92424242 0.89393939 0.90909091 0.90909091] mean value: 0.9017865219357757 key: train_recall value: [0.92605042 0.93097643 0.9210084 0.91932773 0.9210084 0.92268908 0.91932773 0.92605042 0.91596639 0.91596639] mean value: 0.9218371388959624 key: test_roc_auc value: [0.83469019 0.81196291 0.84848485 0.90151515 0.87121212 0.84848485 0.88636364 0.87878788 0.90909091 0.86363636] mean value: 0.8654228855721393 key: train_roc_auc value: [0.88810939 0.89574032 0.88571429 0.88571429 0.88907563 0.88739496 0.88655462 0.88823529 0.88319328 0.88823529] mean value: 0.8877967348555583 key: test_jcc value: [0.71794872 0.6875 0.74683544 0.82894737 0.79268293 0.74358974 0.80263158 0.78666667 0.83333333 0.76923077] mean value: 0.7709366548004895 key: train_jcc value: [0.80555556 0.816839 0.80116959 0.80087848 0.80588235 0.80380673 0.80205279 0.80555556 0.79678363 0.80383481] mean value: 0.8042358482477265 MCC on Blind test: 0.65 Accuracy on Blind test: 0.86 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, 
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.45398355 0.64111161 0.66144133 0.64707065 0.74668407 0.61079025 0.59106517 0.64666867 0.64362574 0.66236663] mean value: 0.6304807662963867 key: score_time value: [0.02003956 0.02357125 0.02011156 0.01998615 0.02011824 0.02004075 0.02002525 0.02092671 0.02399397 0.02006865] mean value: 0.020888209342956543 key: test_mcc value: [0.6695318 0.62406697 0.68568568 0.80758535 0.76237471 0.69825325 0.77495429 0.78824078 0.80386117 0.73029674] mean value: 0.7344850738674684 key: train_mcc value: [0.77850192 0.79341585 0.78824973 0.77317772 0.78279168 0.77672743 0.77477537 0.78505932 0.78455181 0.77766758] mean value: 0.7814918416441247 key: test_accuracy value: [0.83458647 0.81203008 0.84090909 0.90151515 0.87121212 0.84848485 0.88636364 0.89393939 0.90151515 0.86363636] mean value: 0.8654192298929141 key: train_accuracy value: [0.8881413 0.89571068 0.89327731 0.88571429 0.8907563 0.88739496 0.88655462 0.89159664 0.89159664 0.88823529] mean value: 0.8898978026870967 key: test_fscore value: [0.8358209 0.81481481 0.84892086 0.90647482 0.88435374 0.85294118 0.89051095 0.89552239 0.9037037 0.86956522] mean value: 0.8702628569817447 key: train_fscore value: /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:156: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) 
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:159: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [0.89230769 0.89918699 0.89666395 0.88943089 0.89379085 0.89123377 0.8901546 0.89520715 0.89469388 0.89125102] mean value: 0.8933920794349053 key: test_precision value: [0.82352941 0.80882353 0.80821918 0.8630137 0.80246914 0.82857143 0.85915493 0.88235294 0.88405797 0.83333333] mean value: 0.8393525557364458 key: train_precision value: [0.8609375 0.86949686 0.86908517 0.86141732 0.86963434 0.86185243 0.86277603 0.8663522 0.86984127 0.86783439] mean value: 0.8659227516425898 key: test_recall value: [0.84848485 0.82089552 0.89393939 0.95454545 0.98484848 0.87878788 0.92424242 0.90909091 0.92424242 0.90909091] mean value: 0.9048168249660787 key: train_recall value: [0.92605042 0.93097643 0.92605042 0.91932773 0.91932773 0.92268908 0.91932773 0.92605042 0.9210084 0.91596639] mean value: 0.9226774750304162 key: test_roc_auc value: [0.83469019 0.81196291 0.84090909 0.90151515 0.87121212 0.84848485 0.88636364 0.89393939 0.90151515 0.86363636] mean value: 0.8654228855721393 key: train_roc_auc value: [0.88810939 0.89574032 0.89327731 0.88571429 0.8907563 0.88739496 0.88655462 0.89159664 0.89159664 0.88823529] mean value: 0.8898975751916928 key: test_jcc value: [0.71794872 0.6875 0.7375 0.82894737 0.79268293 0.74358974 0.80263158 0.81081081 0.82432432 0.76923077] mean value: 0.7715166240102056 key: train_jcc value: [0.80555556 0.816839 0.81268437 0.80087848 0.80797637 0.80380673 0.80205279 0.81029412 0.80945347 0.80383481] mean value: 0.8073375678553497 MCC on Blind test: 0.65 Accuracy on Blind test: 0.86 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), 
('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', 
ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03648067 0.06293559 0.11051536 0.10283589 0.07630587 0.05248928 0.04713821 0.03610635 0.04521561 0.04661322] mean value: 0.06166360378265381 key: score_time value: [0.0122385 0.0123806 0.02309442 0.0224843 0.01422787 0.01225924 0.01240015 0.01217985 0.0121851 0.01242042] mean value: 0.014587044715881348 key: test_mcc value: [0.79666667 0.60104076 0.51854497 0.63333333 0.63819901 0.71889189 0.42833333 0.6363961 0.62994079 0.71393289] mean value: 0.6315279761986434 key: train_mcc value: [0.72854342 0.73336204 0.78336839 0.72770459 0.77058957 0.73776759 0.74378449 0.73776759 0.76518728 0.72920376] mean value: 0.7457278720621128 key: test_accuracy value: [0.89795918 0.79591837 0.75510204 0.81632653 0.81632653 0.85714286 0.71428571 0.81632653 0.8125 0.85416667] mean value: 0.8136054421768707 key: train_accuracy value: [0.86332574 0.86560364 0.89066059 0.86332574 0.88382688 0.86788155 0.87015945 0.86788155 0.88181818 0.86363636] mean value: 0.8718119693518327 key: test_fscore value: [0.89795918 0.80769231 0.76923077 0.81632653 0.80851064 0.86792453 0.72 0.83018868 0.82352941 0.8627451 ] mean value: 0.8204107146857755 key: train_fscore value: [0.86842105 0.87089716 0.89473684 0.86725664 0.88840263 0.8722467 0.87581699 0.8722467 0.88546256 0.86842105] mean value: 0.8763908306318798 key: 
test_precision value: [0.88 0.75 0.71428571 0.8 0.86363636 0.82142857 0.72 0.78571429 0.77777778 0.81481481] mean value: 0.7927657527657528 key: train_precision value: [0.83898305 0.83966245 0.86440678 0.84482759 0.85294118 0.84255319 0.8375 0.84255319 0.85897436 0.83898305] mean value: 0.8461384833243883 key: test_recall value: [0.91666667 0.875 0.83333333 0.83333333 0.76 0.92 0.72 0.88 0.875 0.91666667] mean value: 0.853 key: train_recall value: [0.9 0.90454545 0.92727273 0.89090909 0.92694064 0.90410959 0.91780822 0.90410959 0.91363636 0.9 ] mean value: 0.9089331672893317 key: test_roc_auc value: [0.89833333 0.7975 0.75666667 0.81666667 0.8175 0.85583333 0.71416667 0.815 0.8125 0.85416667] mean value: 0.8138333333333333 key: train_roc_auc value: [0.86324201 0.86551474 0.890577 0.86326276 0.88392487 0.86796389 0.87026775 0.86796389 0.88181818 0.86363636] mean value: 0.8718171440431715 key: test_jcc value: [0.81481481 0.67741935 0.625 0.68965517 0.67857143 0.76666667 0.5625 0.70967742 0.7 0.75862069] mean value: 0.6982925546315424 key: train_jcc value: [0.76744186 0.77131783 0.80952381 0.765625 0.7992126 0.7734375 0.77906977 0.7734375 0.7944664 0.76744186] mean value: 0.7800974128940519 MCC on Blind test: 0.7 Accuracy on Blind test: 0.88 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.82624602 0.93945646 0.98582363 0.9221282 1.18374133 1.19921923 0.99356842 1.23624206 1.00866294 1.01761222] mean value: 1.0312700510025024 key: score_time value: [0.01234579 0.0122726 0.01226354 0.01245856 0.02000523 0.01238537 0.01224136 0.01265335 0.0125165 0.0130868 ] mean value: 0.013222908973693848 key: test_mcc value: [0.755 0.51854497 0.48142509 0.67333333 0.55166667 0.71889189 0.42833333 0.6363961 0.62994079 0.71393289] mean value: 0.6107465072859045 key: train_mcc value: [0.69679909 0.71561303 0.74250278 0.7006985 0.80039507 0.70658033 0.72615613 0.69802628 0.71631274 0.72543774] mean value: 0.7228521690037979 key: test_accuracy value: [0.87755102 0.75510204 0.73469388 0.83673469 0.7755102 0.85714286 0.71428571 0.81632653 0.8125 0.85416667] mean value: 0.8034013605442176 key: train_accuracy value: [0.84738041 0.85649203 0.87015945 0.84965831 0.89977221 0.85193622 0.86104784 0.84738041 0.85681818 0.86136364] mean value: 0.8602008697452889 key: test_fscore value: [0.875 0.76923077 0.75471698 0.83333333 0.7755102 0.86792453 0.72 0.83018868 0.82352941 0.8627451 ] mean value: 0.8112179005128902 key: train_fscore value: [0.85339168 0.8627451 0.87527352 0.85462555 0.90178571 0.85776805 0.86767896 0.8540305 0.8627451 0.8671024 ] mean value: 0.8657146577807547 key: test_precision value: [0.875 0.71428571 0.68965517 0.83333333 0.79166667 0.82142857 0.72 0.78571429 0.77777778 0.81481481] mean value: 0.7823676336434957 key: train_precision value: [0.82278481 0.82845188 0.84388186 0.82905983 0.88209607 0.82352941 0.82644628 0.81666667 0.82845188 
0.83263598] mean value: 0.8334004673972575 key: test_recall value: [0.875 0.83333333 0.83333333 0.83333333 0.76 0.92 0.72 0.88 0.875 0.91666667] mean value: 0.8446666666666667 key: train_recall value: [0.88636364 0.9 0.90909091 0.88181818 0.92237443 0.89497717 0.91324201 0.89497717 0.9 0.90454545] mean value: 0.900738895807389 key: test_roc_auc value: [0.8775 0.75666667 0.73666667 0.83666667 0.77583333 0.85583333 0.71416667 0.815 0.8125 0.85416667] mean value: 0.8035 key: train_roc_auc value: [0.84729141 0.85639269 0.87007057 0.84958489 0.89982358 0.85203404 0.86116646 0.84748858 0.85681818 0.86136364] mean value: 0.860203403902034 key: test_jcc value: [0.77777778 0.625 0.60606061 0.71428571 0.63333333 0.76666667 0.5625 0.70967742 0.7 0.75862069] mean value: 0.6853922207134109 key: train_jcc value: [0.74427481 0.75862069 0.77821012 0.74615385 0.82113821 0.75095785 0.76628352 0.74524715 0.75862069 0.76538462] mean value: 0.7634891505722061 MCC on Blind test: 0.67 Accuracy on Blind test: 0.86 Model_name: Gaussian NB Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.0174191 0.01351547 0.01054788 0.01021028 0.01021981 0.01003218 0.01009798 0.01021171 0.00998211 0.01007557] mean value: 0.011231207847595214 key: score_time value: [0.01329041 0.01151371 0.00949287 0.00919437 0.00917673 0.00907946 0.00908232 0.00900674 0.0090127 0.00906205] mean value: 0.009791135787963867 key: test_mcc value: [0.55091896 0.48659733 0.39836231 0.40083017 0.35355339 0.55166667 0.2710576 0.55166667 0.37796447 0.54213748] mean value: 0.4484755049286746 key: train_mcc value: [0.49626427 0.53329675 0.49606557 0.50801433 0.57619616 0.48932809 0.55172318 0.47215524 0.53099395 0.49437368] mean value: 0.5148411209141024 key: test_accuracy value: [0.7755102 0.73469388 0.69387755 0.69387755 0.67346939 0.7755102 0.63265306 0.7755102 0.6875 0.77083333] mean value: 0.7213435374149659 key: train_accuracy value: [0.74715262 0.76537585 0.74715262 0.75170843 0.78359909 0.74259681 0.77220957 0.73348519 0.76363636 0.74545455] mean value: 0.7552371091323256 key: test_fscore value: [0.76595745 0.68292683 0.71698113 0.63414634 0.65217391 0.7755102 0.60869565 0.7755102 0.66666667 0.76595745] mean value: 0.7044525836471524 key: train_fscore value: [0.73634204 0.75417661 0.75816993 0.73479319 0.76190476 0.72371638 0.75124378 0.71111111 0.74879227 0.7294686 ] mean value: 0.740971868081603 key: test_precision value: [0.7826087 0.82352941 0.65517241 0.76470588 0.71428571 0.79166667 0.66666667 0.79166667 0.71428571 0.7826087 ] mean value: 0.7487196527786527 key: train_precision value: [0.77114428 0.79396985 0.72803347 0.79057592 0.84444444 0.77894737 0.82513661 0.77419355 0.79896907 
0.77835052] mean value: 0.7883765077790228 key: test_recall value: [0.75 0.58333333 0.79166667 0.54166667 0.6 0.76 0.56 0.76 0.625 0.75 ] mean value: 0.6721666666666667 key: train_recall value: [0.70454545 0.71818182 0.79090909 0.68636364 0.69406393 0.67579909 0.68949772 0.65753425 0.70454545 0.68636364] mean value: 0.7007804068078041 key: test_roc_auc value: [0.775 0.73166667 0.69583333 0.69083333 0.675 0.77583333 0.63416667 0.77583333 0.6875 0.77083333] mean value: 0.7212500000000001 key: train_roc_auc value: [0.7472499 0.7654836 0.74705272 0.75185762 0.7833956 0.742445 0.77202159 0.73331258 0.76363636 0.74545455] mean value: 0.7551909506019095 key: test_jcc value: [0.62068966 0.51851852 0.55882353 0.46428571 0.48387097 0.63333333 0.4375 0.63333333 0.5 0.62068966] mean value: 0.5471044706969427 key: train_jcc value: [0.58270677 0.60536398 0.61052632 0.58076923 0.61538462 0.56704981 0.60159363 0.55172414 0.5984556 0.57414449] mean value: 0.5887718570540718 MCC on Blind test: 0.45 Accuracy on Blind test: 0.78 Model_name: Naive Bayes Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01025057 0.01029849 0.01029444 0.01029325 0.01027894 0.01024628 0.01033664 0.01036143 0.01040602 0.01041937] mean value: 0.010318541526794433 key: score_time value: [0.00901389 0.00901246 0.00904012 0.009022 0.00901365 0.00907755 0.0090878 0.00918317 0.00921178 0.00922823] mean value: 0.009089064598083497 key: test_mcc value: [0.715 0.38833333 0.39196475 0.51252158 0.34666667 0.59839104 0.63333333 0.55166667 0.58536941 0.58333333] mean value: 0.5306580112563036 key: train_mcc value: [0.59080781 0.61303798 0.60465079 0.62803099 0.60397855 0.62220982 0.59975096 0.59009009 0.5917901 0.56832862] mean value: 0.6012675717929086 key: test_accuracy value: [0.85714286 0.69387755 0.69387755 0.75510204 0.67346939 0.79591837 0.81632653 0.7755102 0.79166667 0.79166667] mean value: 0.7644557823129251 key: train_accuracy value: [0.79498861 0.80637813 0.80182232 0.81321185 0.80182232 0.81093394 0.79954442 0.79498861 0.79545455 0.78409091] mean value: 0.8003235659556844 key: test_fscore value: [0.85714286 0.69387755 0.70588235 0.76 0.68 0.81481481 0.81632653 0.7755102 0.8 0.79166667] mean value: 0.7695220977279801 key: train_fscore value: [0.80088496 0.8098434 0.80794702 0.82017544 0.80449438 0.81348315 0.80357143 0.79638009 0.80088496 0.7816092 ] mean value: 0.8039274012977246 key: test_precision value: [0.84 0.68 0.66666667 0.73076923 0.68 0.75862069 0.83333333 0.79166667 0.76923077 0.79166667] mean value: 0.7541954022988506 key: train_precision value: [0.78017241 0.79735683 0.78540773 0.79237288 0.7920354 0.80088496 0.7860262 0.78923767 0.78017241 0.79069767] mean value: 
0.7894364159893563 key: test_recall value: [0.875 0.70833333 0.75 0.79166667 0.68 0.88 0.8 0.76 0.83333333 0.79166667] mean value: 0.787 key: train_recall value: [0.82272727 0.82272727 0.83181818 0.85 0.8173516 0.82648402 0.82191781 0.80365297 0.82272727 0.77272727] mean value: 0.8192133665421336 key: test_roc_auc value: [0.8575 0.69416667 0.695 0.75583333 0.67333333 0.79416667 0.81666667 0.77583333 0.79166667 0.79166667] mean value: 0.7645833333333334 key: train_roc_auc value: [0.79492528 0.80634081 0.80175384 0.81312785 0.80185762 0.81096928 0.79959527 0.7950083 0.79545455 0.78409091] mean value: 0.8003123702781237 key: test_jcc value: [0.75 0.53125 0.54545455 0.61290323 0.51515152 0.6875 0.68965517 0.63333333 0.66666667 0.65517241] mean value: 0.6287086872619408 key: train_jcc value: [0.66789668 0.68045113 0.67777778 0.69516729 0.67293233 0.68560606 0.67164179 0.66165414 0.66789668 0.64150943] mean value: 0.6722533301554774 MCC on Blind test: 0.61 Accuracy on Blind test: 0.84 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01000905 0.01069713 0.01083493 0.01071715 0.0106926 0.01064563 0.01084876 0.01088238 0.01057363 0.01007628] mean value: 0.010597753524780273 key: score_time value: [0.0182538 0.01808405 0.01793838 0.01800442 0.01830792 0.01789689 0.01869392 0.01868653 0.01774597 0.01882052] mean value: 0.018243241310119628 key: test_mcc value: [0.14333333 0.34666667 0.34891534 0.05892557 0.22370649 0.47 0.39196475 0.38731273 0.46195658 0.58536941] mean value: 0.34181508579127046 key: train_mcc value: [0.58110664 0.58999108 0.54898298 0.56719733 0.59933565 0.58999108 0.56745316 0.59908676 0.54581553 0.54095939] mean value: 0.5729919588657677 key: test_accuracy value: [0.57142857 0.67346939 0.67346939 0.53061224 0.6122449 0.73469388 0.69387755 0.69387755 0.72916667 0.79166667] mean value: 0.6704506802721089 key: train_accuracy value: [0.7904328 0.79498861 0.77448747 0.78359909 0.79954442 0.79498861 0.78359909 0.79954442 0.77272727 0.77045455] mean value: 0.7864366328432387 key: test_fscore value: [0.57142857 0.66666667 0.68 0.48888889 0.62745098 0.73469388 0.68085106 0.70588235 0.74509804 0.7826087 ] mean value: 0.6683569136566129 key: train_fscore value: [0.78801843 0.79638009 0.77448747 0.7845805 0.8018018 0.79357798 0.77958237 0.79908676 0.77678571 0.76887872] mean value: 0.7863179834924426 key: test_precision value: [0.56 0.66666667 0.65384615 0.52380952 0.61538462 0.75 0.72727273 0.69230769 0.7037037 0.81818182] mean value: 0.6711172901172902 key: train_precision value: [0.79906542 0.79279279 0.77625571 0.78280543 0.79111111 0.79723502 0.79245283 0.79908676 0.76315789 
0.77419355] mean value: 0.7868156516436422 key: test_recall value: [0.58333333 0.66666667 0.70833333 0.45833333 0.64 0.72 0.64 0.72 0.79166667 0.75 ] mean value: 0.6678333333333333 key: train_recall value: [0.77727273 0.8 0.77272727 0.78636364 0.81278539 0.78995434 0.76712329 0.79908676 0.79090909 0.76363636] mean value: 0.7859858862598589 key: test_roc_auc value: [0.57166667 0.67333333 0.67416667 0.52916667 0.61166667 0.735 0.695 0.69333333 0.72916667 0.79166667] mean value: 0.6704166666666667 key: train_roc_auc value: [0.79046285 0.79497717 0.77449149 0.78359278 0.79957451 0.79497717 0.78356164 0.79954338 0.77272727 0.77045455] mean value: 0.7864362806143628 key: test_jcc value: [0.4 0.5 0.51515152 0.32352941 0.45714286 0.58064516 0.51612903 0.54545455 0.59375 0.64285714] mean value: 0.5074659665919153 key: train_jcc value: [0.65019011 0.66165414 0.63197026 0.64552239 0.66917293 0.65779468 0.63878327 0.66539924 0.6350365 0.62453532] mean value: 0.6480058828667646 MCC on Blind test: 0.4 Accuracy on Blind test: 0.73 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.02519631 0.02429771 0.02343082 0.02211881 0.02026248 0.02048516 0.02434134 0.02253342 0.02340245 0.02184677] mean value: 0.022791528701782228 key: score_time value: [0.01306438 0.01318479 0.0132587 0.01345491 0.01253915 0.01197505 0.0122962 0.01304913 0.01205301 0.01192784] mean value: 0.012680315971374511 key: test_mcc value: [0.83973406 0.49255205 0.55612092 0.7145252 0.59166667 0.74389879 0.51089422 0.55390031 0.59160798 0.72422435] mean value: 0.6319124563642237 key: train_mcc value: [0.68661164 0.72255181 0.7150502 0.70081451 0.70386925 0.69938716 0.72442006 0.66799508 0.69328082 0.68243161] mean value: 0.69964121420511 key: test_accuracy value: [0.91836735 0.73469388 0.7755102 0.85714286 0.79591837 0.85714286 0.75510204 0.7755102 0.79166667 0.85416667] mean value: 0.8115221088435374 key: train_accuracy value: [0.8405467 0.85876993 0.85421412 0.84738041 0.84738041 0.84738041 0.85876993 0.82915718 0.84318182 0.83863636] mean value: 0.8465417270656451 key: test_fscore value: [0.92 0.76363636 0.78431373 0.85106383 0.8 0.87719298 0.76923077 0.79245283 0.80769231 0.86792453] mean value: 0.8233507336783576 key: train_fscore value: [0.85042735 0.86695279 0.86382979 0.85714286 0.85835095 0.85529158 0.86752137 0.84210526 0.85350318 0.84796574] mean value: 0.8563090866702563 key: test_precision value: [0.88461538 0.67741935 0.74074074 0.86956522 0.8 0.78125 0.74074074 0.75 0.75 0.79310345] mean value: 0.7787434886602742 key: train_precision value: [0.80241935 0.82113821 0.812 0.80722892 0.7992126 0.81147541 0.81526104 0.78125 0.80079681 0.80161943] mean value: 
0.8052401780268827 key: test_recall value: [0.95833333 0.875 0.83333333 0.83333333 0.8 1. 0.8 0.84 0.875 0.95833333] mean value: 0.8773333333333333 key: train_recall value: [0.90454545 0.91818182 0.92272727 0.91363636 0.92694064 0.90410959 0.92694064 0.91324201 0.91363636 0.9 ] mean value: 0.9143960149439602 key: test_roc_auc value: [0.91916667 0.7375 0.77666667 0.85666667 0.79583333 0.85416667 0.75416667 0.77416667 0.79166667 0.85416667] mean value: 0.8114166666666667 key: train_roc_auc value: [0.84040058 0.85863429 0.8540577 0.84722914 0.84756123 0.84750934 0.85892487 0.82934828 0.84318182 0.83863636] mean value: 0.8465483603154836 key: test_jcc value: [0.85185185 0.61764706 0.64516129 0.74074074 0.66666667 0.78125 0.625 0.65625 0.67741935 0.76666667] mean value: 0.7028653629910746 key: train_jcc value: [0.73977695 0.76515152 0.76029963 0.75 0.75185185 0.74716981 0.76603774 0.72727273 0.74444444 0.73605948] mean value: 0.7488064142585281 MCC on Blind test: 0.68 Accuracy on Blind test: 0.86 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [0.61014509 2.33109617 2.37079334 0.93144178 2.48047662 2.4058454 2.36077499 2.46738005 2.3683176 1.95464587] mean value: 2.0280916929244994 key: score_time value: [0.01267934 0.01379967 0.02186561 0.01337576 0.01495171 0.01306129 0.01289058 0.01424766 0.01542616 0.01264715] mean value: 0.014494490623474122 key: test_mcc value: [0.79666667 0.51 0.43071846 0.6750504 0.55166667 0.7145252 0.46911585 0.55390031 0.54594868 0.50709255] mean value: 0.5754684792348235 key: train_mcc value: [0.7180484 0.95013496 0.96355334 0.76321363 0.95948845 0.93167289 0.96371487 0.95013496 0.96367619 0.83223957] mean value: 0.8995877269556399 key: test_accuracy value: [0.89795918 0.75510204 0.71428571 0.83673469 0.7755102 0.85714286 0.73469388 0.7755102 0.77083333 0.75 ] mean value: 0.7867772108843537 key: train_accuracy value: [0.85421412 0.97494305 0.98177677 0.88154897 0.97949886 0.96583144 0.98177677 0.97494305 0.98181818 0.91590909] mean value: 0.9492260302340029 key: test_fscore value: [0.89795918 0.75 0.72 0.82608696 0.7755102 0.8627451 0.74509804 0.79245283 0.78431373 0.76923077] mean value: 0.7923396806441387 key: train_fscore value: [0.86554622 0.97471264 0.98181818 0.88288288 0.97977528 0.96583144 0.98190045 
0.9751693 0.98190045 0.91722595] mean value: 0.9506762798831331 key: test_precision value: [0.88 0.75 0.69230769 0.86363636 0.79166667 0.84615385 0.73076923 0.75 0.74074074 0.71428571] mean value: 0.7759560254560255 key: train_precision value: [0.8046875 0.98604651 0.98181818 0.875 0.96460177 0.96363636 0.97309417 0.96428571 0.97747748 0.9030837 ] mean value: 0.9393731389601265 key: test_recall value: [0.91666667 0.75 0.75 0.79166667 0.76 0.88 0.76 0.84 0.83333333 0.83333333] mean value: 0.8115 key: train_recall value: [0.93636364 0.96363636 0.98181818 0.89090909 0.99543379 0.96803653 0.99086758 0.98630137 0.98636364 0.93181818] mean value: 0.9631548360315483 key: test_roc_auc value: [0.89833333 0.755 0.715 0.83583333 0.77583333 0.85666667 0.73416667 0.77416667 0.77083333 0.75 ] mean value: 0.7865833333333333 key: train_roc_auc value: [0.85402657 0.97496887 0.98177667 0.8815276 0.97953508 0.96583645 0.98179743 0.97496887 0.98181818 0.91590909] mean value: 0.9492164798671648 key: test_jcc value: [0.81481481 0.6 0.5625 0.7037037 0.63333333 0.75862069 0.59375 0.65625 0.64516129 0.625 ] mean value: 0.6593133831829605 key: train_jcc value: [0.76296296 0.95067265 0.96428571 0.79032258 0.96035242 0.9339207 0.96444444 0.95154185 0.96444444 0.84710744] mean value: 0.9090055208512735 MCC on Blind test: 0.68 Accuracy on Blind test: 0.86 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, 
random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.03068066 0.02850795 0.02653122 0.02450705 0.02715755 0.02531695 0.0240128 0.02486944 0.02457714 0.02573967] mean value: 0.02619004249572754 key: score_time value: [0.01240253 0.00964069 0.00914097 0.00925875 0.00932145 0.00931454 0.00973344 0.00957346 0.00942445 0.00928974] mean value: 0.009710001945495605 key: test_mcc value: [0.63819901 0.67612782 0.63333333 0.63819901 0.67333333 0.55091896 0.7202771 0.68353656 0.75261781 0.9591663 ] mean value: 0.6925709246850608 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.81632653 0.83673469 0.81632653 0.81632653 0.83673469 0.7755102 0.85714286 0.83673469 0.875 0.97916667] mean value: 0.8446003401360545 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.82352941 0.84 0.81632653 0.82352941 0.84 0.78431373 0.85106383 0.82608696 0.86956522 0.9787234 ] mean value: 0.8453138487587449 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.77777778 0.80769231 0.8 0.77777778 0.84 0.76923077 0.90909091 0.9047619 0.90909091 1. ] mean value: 0.8495422355422355 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.875 0.875 0.83333333 0.875 0.84 0.8 0.8 0.76 0.83333333 0.95833333] mean value: 0.845 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.8175 0.8375 0.81666667 0.8175 0.83666667 0.775 0.85833333 0.83833333 0.875 0.97916667] mean value: 0.8451666666666666 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.7 0.72413793 0.68965517 0.7 0.72413793 0.64516129 0.74074074 0.7037037 0.76923077 0.95833333] mean value: 0.7355100871813887 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.73 Accuracy on Blind test: 0.89 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), 
('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.13182998 0.12682986 0.12640691 0.12741017 0.12682033 0.12823749 0.12678123 0.12986279 0.12703943 0.12754059] mean value: 0.1278758764266968 key: score_time value: [0.01910758 0.01841569 0.01830077 0.01848006 0.01812434 0.01878262 0.01830769 0.01837826 0.01823258 0.01816678] mean value: 0.01842963695526123 key: test_mcc value: [0.755 0.60104076 0.51252158 0.5943247 0.43604918 0.63333333 0.51089422 0.5943247 0.75261781 0.6761234 ] mean value: 0.6066229699611482 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.87755102 0.79591837 0.75510204 0.79591837 0.71428571 0.81632653 0.75510204 0.79591837 0.875 0.83333333] mean value: 0.8014455782312925 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.875 0.80769231 0.76 0.8 0.69565217 0.81632653 0.76923077 0.79166667 0.88 0.84615385] mean value: 0.8041722294268878 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.875 0.75 0.73076923 0.76923077 0.76190476 0.83333333 0.74074074 0.82608696 0.84615385 0.78571429] mean value: 0.7918933924368707 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.875 0.875 0.79166667 0.83333333 0.64 0.8 0.8 0.76 0.91666667 0.91666667] mean value: 0.8208333333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.8775 0.7975 0.75583333 0.79666667 0.71583333 0.81666667 0.75416667 0.79666667 0.875 0.83333333] mean value: 0.8019166666666666 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.77777778 0.67741935 0.61290323 0.66666667 0.53333333 0.68965517 0.625 0.65517241 0.78571429 0.73333333] mean value: 0.6756975563677454 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.67 Accuracy on Blind test: 0.86 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01058841 0.01049209 0.01040554 0.01045561 0.01053834 0.01071095 0.01147485 0.01093698 0.01081371 0.01044965] mean value: 0.010686612129211426 key: score_time value: [0.00906849 0.00916314 0.00917268 0.0091629 0.00955749 0.00915098 0.00915647 0.00910068 0.00912881 0.00956321] mean value: 0.00922248363494873 key: test_mcc value: [0.27701416 0.34673805 0.34666667 0.31529953 0.22780857 0.05892557 0.06166667 0.43071846 0.39204616 0.3380617 ] mean value: 0.2794945523300344 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.63265306 0.67346939 0.67346939 0.65306122 0.6122449 0.53061224 0.53061224 0.71428571 0.6875 0.66666667] mean value: 0.6374574829931973 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.65217391 0.66666667 0.67924528 0.59574468 0.56603774 0.53061224 0.70833333 0.72727273 0.69230769] mean value: 0.6485060943907512 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.6 0.68181818 0.66666667 0.62068966 0.63636364 0.53571429 0.54166667 0.73913043 0.64516129 0.64285714] mean value: 0.6310067960364183 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.75 0.625 0.66666667 0.75 0.56 0.6 0.52 0.68 0.83333333 0.75 ] mean value: 0.6735 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.635 0.6725 0.67333333 0.655 0.61333333 0.52916667 0.53083333 0.715 0.6875 0.66666667] mean value: 0.6378333333333334 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.48387097 0.5 0.51428571 0.42424242 0.39473684 0.36111111 0.5483871 0.57142857 0.52941176] mean value: 0.4827474492395096 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.4 Accuracy on Blind test: 0.74 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, 
interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.87583995 1.81410217 1.99092579 2.11582375 2.0683229 2.05223727 2.08664584 2.0749619 2.19409323 2.02771664] mean value: 2.0300669431686402 key: score_time value: [0.09192443 0.09313083 0.09533286 0.10319018 0.09989047 0.20439339 0.092803 0.10103011 0.12323284 0.10469651] mean value: 0.11096246242523193 key: test_mcc value: [0.80235519 0.88443328 0.715 0.76603235 0.63333333 0.755 0.7145252 0.7202771 0.70894901 0.91666667] mean value: 0.7616572124350064 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.89795918 0.93877551 0.85714286 0.87755102 0.81632653 0.87755102 0.85714286 0.85714286 0.85416667 0.95833333] mean value: 0.8792091836734693 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90196078 0.94117647 0.85714286 0.88461538 0.81632653 0.88 0.8627451 0.85106383 0.85714286 0.95833333] mean value: 0.8810507145575088 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85185185 0.88888889 0.84 0.82142857 0.83333333 0.88 0.84615385 0.90909091 0.84 0.95833333] mean value: 0.8669080734080734 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.95833333 1. 0.875 0.95833333 0.8 0.88 0.88 0.8 0.875 0.95833333] mean value: 0.8985000000000001 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.89916667 0.94 0.8575 0.87916667 0.81666667 0.8775 0.85666667 0.85833333 0.85416667 0.95833333] mean value: 0.87975 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.82142857 0.88888889 0.75 0.79310345 0.68965517 0.78571429 0.75862069 0.74074074 0.75 0.92 ] mean value: 0.7898151797117314 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.79 Accuracy on Blind test: 0.91 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, 
validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn(
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.28804898 1.22133827 1.22952151 1.21814561 1.28684974 1.25389266 1.22434378 1.77762294 2.32905412 0.99304485] mean value: 1.3821862459182739 key: score_time value: [0.14605403 0.16223454 0.17510486 0.18274355 0.15737247 0.18217468 0.25552583 0.20922709 0.21482563 0.25939536] mean value: 0.1944658041000366 key: test_mcc value: [0.83973406 0.88443328 0.67333333 0.83973406 0.63333333 0.755 0.6750504 0.755 0.70894901 0.87576054] mean value: 0.7640328006862502 key: train_mcc value: [0.89566388 0.90948015 0.9227863 0.89566388 0.9230259 0.92333442 0.9230259 0.90949509 0.90942919 0.91387241] mean value: 0.9125777123466576 key: test_accuracy value: [0.91836735 0.93877551 0.83673469 0.91836735 0.81632653 0.87755102 0.83673469 0.87755102 0.85416667 0.9375 ] mean value: 0.8812074829931973 key: train_accuracy value: [0.9476082 0.95444191 0.96127563 0.9476082 0.96127563 0.96127563 0.96127563 0.95444191 0.95454545 0.95681818] mean value: 0.9560566369848831 key: test_fscore value: [0.92 0.94117647 0.83333333 0.92 0.81632653 0.88 0.84615385 0.88 0.85714286 0.93877551] mean value: 0.8832908548034598 key: train_fscore value: [0.94854586 0.95535714 0.96179775 0.94854586 0.96179775 0.96196868 0.96179775 0.95515695 0.95515695 0.95730337] mean value: 0.9567428076100482 key: test_precision value: [0.88461538 0.88888889 0.83333333 0.88461538 0.83333333 0.88 0.81481481 0.88 0.84 0.92 ] mean value: 0.865960113960114 key: train_precision value: [0.9339207 0.93859649 0.95111111 0.9339207 0.94690265 0.94298246 0.94690265 
0.93832599 0.94247788 0.94666667] mean value: 0.9421807311867965 key: test_recall value: [0.95833333 1. 0.83333333 0.95833333 0.8 0.88 0.88 0.88 0.875 0.95833333] mean value: 0.9023333333333333 key: train_recall value: [0.96363636 0.97272727 0.97272727 0.96363636 0.97716895 0.98173516 0.97716895 0.97260274 0.96818182 0.96818182] mean value: 0.9717766708177666 key: test_roc_auc value: [0.91916667 0.94 0.83666667 0.91916667 0.81666667 0.8775 0.83583333 0.8775 0.85416667 0.9375 ] mean value: 0.8814166666666667 key: train_roc_auc value: [0.94757161 0.95440017 0.96124948 0.94757161 0.96131175 0.96132213 0.96131175 0.95448319 0.95454545 0.95681818] mean value: 0.9560585305105853 key: test_jcc value: [0.85185185 0.88888889 0.71428571 0.85185185 0.68965517 0.78571429 0.73333333 0.78571429 0.75 0.88461538] mean value: 0.793591076866939 key: train_jcc value: [0.90212766 0.91452991 0.92640693 0.90212766 0.92640693 0.92672414 0.92640693 0.91416309 0.91416309 0.91810345] mean value: 0.9171159779364038 MCC on Blind test: 0.79 Accuracy on Blind test: 0.91 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.023139 0.01131988 0.01124835 0.01410437 0.01195288 0.01177645 0.01184845 0.01120639 0.01069093 0.01091528] mean value: 0.012820196151733399 key: score_time value: [0.00903487 0.00906825 0.01109552 0.01174092 0.01075888 0.01009846 0.0100162 0.01012874 0.01001716 0.0095706 ] mean value: 0.010152959823608398 key: test_mcc value: [0.715 0.38833333 0.39196475 0.51252158 0.34666667 0.59839104 0.63333333 0.55166667 0.58536941 0.58333333] mean value: 0.5306580112563036 key: train_mcc value: [0.59080781 0.61303798 0.60465079 0.62803099 0.60397855 0.62220982 0.59975096 0.59009009 0.5917901 0.56832862] mean value: 0.6012675717929086 key: test_accuracy value: [0.85714286 0.69387755 0.69387755 0.75510204 0.67346939 0.79591837 0.81632653 0.7755102 0.79166667 0.79166667] mean value: 0.7644557823129251 key: train_accuracy value: [0.79498861 0.80637813 0.80182232 0.81321185 0.80182232 0.81093394 0.79954442 0.79498861 0.79545455 0.78409091] mean value: 0.8003235659556844 key: test_fscore value: [0.85714286 0.69387755 0.70588235 0.76 0.68 0.81481481 0.81632653 0.7755102 0.8 0.79166667] mean value: 0.7695220977279801 key: train_fscore value: [0.80088496 0.8098434 0.80794702 0.82017544 0.80449438 0.81348315 0.80357143 0.79638009 0.80088496 0.7816092 ] mean value: 0.8039274012977246 key: test_precision value: [0.84 0.68 0.66666667 0.73076923 0.68 0.75862069 0.83333333 0.79166667 0.76923077 0.79166667] mean value: 0.7541954022988506 key: train_precision value: [0.78017241 0.79735683 0.78540773 0.79237288 0.7920354 0.80088496 0.7860262 0.78923767 0.78017241 0.79069767] mean value: 
0.7894364159893563 key: test_recall value: [0.875 0.70833333 0.75 0.79166667 0.68 0.88 0.8 0.76 0.83333333 0.79166667] mean value: 0.787 key: train_recall value: [0.82272727 0.82272727 0.83181818 0.85 0.8173516 0.82648402 0.82191781 0.80365297 0.82272727 0.77272727] mean value: 0.8192133665421336 key: test_roc_auc value: [0.8575 0.69416667 0.695 0.75583333 0.67333333 0.79416667 0.81666667 0.77583333 0.79166667 0.79166667] mean value: 0.7645833333333334 key: train_roc_auc value: [0.79492528 0.80634081 0.80175384 0.81312785 0.80185762 0.81096928 0.79959527 0.7950083 0.79545455 0.78409091] mean value: 0.8003123702781237 key: test_jcc value: [0.75 0.53125 0.54545455 0.61290323 0.51515152 0.6875 0.68965517 0.63333333 0.66666667 0.65517241] mean value: 0.6287086872619408 key: train_jcc value: [0.66789668 0.68045113 0.67777778 0.69516729 0.67293233 0.68560606 0.67164179 0.66165414 0.66789668 0.64150943] mean value: 0.6722533301554774 MCC on Blind test: 0.61 Accuracy on Blind test: 0.84 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.24321389 0.33406544 0.34369111 0.25171113 0.23831272 0.22113085 0.21617317 0.21385956 0.24165797 7.08838511] mean value: 0.9392200946807862 key: score_time value: [0.01211691 0.01195264 0.01236773 0.0118885 0.01287055 0.0118053 0.01144505 0.01131988 0.01220369 0.01409149] mean value: 0.012206172943115235 key: test_mcc value: [0.84852814 0.84852814 0.67333333 0.87833333 0.83920658 0.7145252 0.755 0.79632832 0.79235477 0.91986621] mean value: 0.8066004026133584 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.91836735 0.91836735 0.83673469 0.93877551 0.91836735 0.85714286 0.87755102 0.89795918 0.89583333 0.95833333] mean value: 0.9017431972789116 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.92307692 0.92307692 0.83333333 0.93877551 0.92307692 0.8627451 0.88 0.90196078 0.89795918 0.95652174] mean value: 0.904052641792503 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85714286 0.85714286 0.83333333 0.92 0.88888889 0.84615385 0.88 0.88461538 0.88 1. ] mean value: 0.8847277167277168 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.83333333 0.95833333 0.96 0.88 0.88 0.92 0.91666667 0.91666667] mean value: 0.9265 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.92 0.92 0.83666667 0.93916667 0.9175 0.85666667 0.8775 0.8975 0.89583333 0.95833333] mean value: 0.9019166666666667 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.85714286 0.85714286 0.71428571 0.88461538 0.85714286 0.75862069 0.78571429 0.82142857 0.81481481 0.91666667] mean value: 0.8267574698609181 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.83 Accuracy on Blind test: 0.93 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive 
Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.14163303 0.13927984 0.1110971 0.13177848 0.04264712 0.04237604 0.12133455 0.12579513 0.10211635 0.13984275] mean value: 0.1097900390625 key: score_time value: [0.02664208 0.03177047 0.0274868 0.01261592 0.01301384 0.01302195 0.02215147 0.02167702 0.04612851 0.03137469] mean value: 0.02458827495574951 key: test_mcc value: [0.55091896 0.51 0.36080239 0.67612782 0.43604918 0.5943247 0.46911585 0.59166667 0.45873171 0.71393289] mean value: 0.5361670172034785 key: train_mcc value: [0.8047578 0.80035377 0.8409621 0.79158975 0.83771907 0.79968448 0.79601091 0.81329265 0.80119274 0.80082773] mean value: 0.808639100168264 key: test_accuracy value: [0.7755102 0.75510204 0.67346939 0.83673469 
0.71428571 0.79591837 0.73469388 0.79591837 0.72916667 0.85416667] mean value: 0.7664965986394557 key: train_accuracy value: [0.90205011 0.89977221 0.92027335 0.8952164 0.91799544 0.89977221 0.89749431 0.90660592 0.9 0.9 ] mean value: 0.9039179954441914 key: test_fscore value: [0.76595745 0.75 0.7037037 0.84 0.69565217 0.79166667 0.74509804 0.8 0.73469388 0.8627451 ] mean value: 0.7689517005897848 key: train_fscore value: [0.90423163 0.90222222 0.92170022 0.89823009 0.92035398 0.90045249 0.89977728 0.90702948 0.90265487 0.90222222] mean value: 0.9058874482042989 key: test_precision value: [0.7826087 0.75 0.63333333 0.80769231 0.76190476 0.82608696 0.73076923 0.8 0.72 0.81481481] mean value: 0.7627210100688362 key: train_precision value: [0.88646288 0.8826087 0.90748899 0.875 0.89270386 0.89237668 0.87826087 0.9009009 0.87931034 0.8826087 ] mean value: 0.8877721919753557 key: test_recall value: [0.75 0.75 0.79166667 0.875 0.64 0.76 0.76 0.8 0.75 0.91666667] mean value: 0.7793333333333333 key: train_recall value: [0.92272727 0.92272727 0.93636364 0.92272727 0.94977169 0.9086758 0.92237443 0.91324201 0.92727273 0.92272727] mean value: 0.9248609381486094 key: test_roc_auc value: [0.775 0.755 0.67583333 0.8375 0.71583333 0.79666667 0.73416667 0.79583333 0.72916667 0.85416667] mean value: 0.7669166666666667 key: train_roc_auc value: [0.90200291 0.8997198 0.92023661 0.89515359 0.91806766 0.89979244 0.89755085 0.906621 0.9 0.9 ] mean value: 0.9039144873391449 key: test_jcc value: [0.62068966 0.6 0.54285714 0.72413793 0.53333333 0.65517241 0.59375 0.66666667 0.58064516 0.75862069] mean value: 0.6275872993802638 key: train_jcc value: [0.82520325 0.82186235 0.85477178 0.81526104 0.85245902 0.81893004 0.81781377 0.82987552 0.82258065 0.82186235] mean value: 0.8280619763359249 MCC on Blind test: 0.64 Accuracy on Blind test: 0.85 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic 
RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', 
transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01937389 0.05064607 0.05823445 0.03910899 0.03520322 0.01185274 0.011024 0.01107717 0.01257539 0.0136404 ] mean value: 0.026273632049560548 key: score_time value: [0.02086258 0.0563364 0.03695273 0.0254395 0.01412416 0.00974655 0.00964689 0.01003909 0.01183605 0.01182985] mean value: 0.020681381225585938 key: test_mcc value: [0.6750504 0.47 0.47 0.46911585 0.43071846 0.42833333 0.43071846 0.59166667 0.58536941 0.5 ] mean value: 0.5050972578617717 key: train_mcc value: [0.51263368 0.54898298 0.56723544 0.54913848 0.55809465 0.5626401 0.57175176 0.53074991 0.54095939 0.54091468] mean value: 0.5483101071639697 key: test_accuracy value: [0.83673469 0.73469388 0.73469388 0.73469388 0.71428571 0.71428571 0.71428571 0.79591837 0.79166667 0.75 ] mean value: 0.752125850340136 key: train_accuracy value: [0.75626424 0.77448747 0.78359909 0.77448747 0.77904328 0.78132118 0.78587699 0.76537585 0.77045455 0.77045455] mean value: 0.7741364671774694 key: test_fscore value: [0.82608696 0.73469388 0.73469388 0.72340426 0.70833333 0.72 0.70833333 0.8 0.8 0.75 ] mean value: 0.7505545633609596 key: train_fscore value: [0.75955056 0.77448747 0.78555305 0.77241379 0.77904328 0.78082192 0.78538813 0.76430206 0.77200903 0.77097506] mean value: 0.7744544345207075 key: test_precision value: [0.86363636 0.72 0.72 0.73913043 0.73913043 0.72 0.73913043 0.8 0.76923077 0.75 ] 
mean value: 0.7560258437214958 key: train_precision value: [0.75111111 0.77625571 0.78026906 0.78139535 0.77727273 0.78082192 0.78538813 0.76605505 0.76681614 0.76923077] mean value: 0.7734615957541756 key: test_recall value: [0.79166667 0.75 0.75 0.70833333 0.68 0.72 0.68 0.8 0.83333333 0.75 ] mean value: 0.7463333333333333 key: train_recall value: [0.76818182 0.77272727 0.79090909 0.76363636 0.78082192 0.78082192 0.78538813 0.76255708 0.77727273 0.77272727] mean value: 0.7755043586550436 key: test_roc_auc value: [0.83583333 0.735 0.735 0.73416667 0.715 0.71416667 0.715 0.79583333 0.79166667 0.75 ] mean value: 0.7521666666666667 key: train_roc_auc value: [0.75623703 0.77449149 0.7835824 0.77451225 0.77904732 0.78132005 0.78587588 0.76536945 0.77045455 0.77045455] mean value: 0.774134495641345 key: test_jcc value: [0.7037037 0.58064516 0.58064516 0.56666667 0.5483871 0.5625 0.5483871 0.66666667 0.66666667 0.6 ] mean value: 0.6024268219832736 key: train_jcc value: [0.61231884 0.63197026 0.64684015 0.62921348 0.6380597 0.64044944 0.64661654 0.61851852 0.62867647 0.62730627] mean value: 0.6319969675865363 MCC on Blind test: 0.55 Accuracy on Blind test: 0.82 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, 
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01522183 0.04666972 0.06443954 0.04834628 0.04992676 0.06265259 0.05166793 0.04743147 0.0382247 0.01697612] mean value: 0.04415569305419922 key: score_time value: [0.01111174 0.019274 0.01320052 0.01206517 0.01248288 0.01242256 0.01222825 0.01206899 0.01885581 0.0120883 ] mean value: 0.013579821586608887 key: test_mcc value: [0.67612782 0.61216708 0.3492027 0.7202771 0.47140452 0.71889189 0.39836231 0.71889189 0.57735027 0.30151134] mean value: 0.5544186940394943 key: train_mcc value: [0.64178459 0.73768986 0.77430537 0.70066939 0.74181793 0.72352791 0.68212557 0.66949315 0.57783716 0.24731124] mean value: 0.6496562173213511 key: test_accuracy value: [0.83673469 0.79591837 0.67346939 0.85714286 0.73469388 0.85714286 0.69387755 0.85714286 0.75 0.58333333] mean value: 0.7639455782312925 key: train_accuracy value: [0.81321185 0.86332574 0.88610478 0.84510251 0.86332574 0.85876993 0.83143508 0.83371298 0.76363636 0.56136364] mean value: 0.811998861047836 key: test_fscore value: [0.84 0.81481481 0.63636364 0.8627451 0.75471698 0.86792453 0.66666667 0.86792453 0.8 0.70588235] mean value: 0.781703860656136 key: train_fscore value: [0.83196721 0.87447699 0.88207547 0.85774059 0.87551867 0.86695279 0.80829016 0.83956044 0.80377358 0.69413629] mean value: 0.8334492191440513 key: test_precision value: [0.80769231 0.73333333 0.7 0.81481481 0.71428571 0.82142857 0.75 0.82142857 0.66666667 0.54545455] mean value: 0.7375104525104524 key: train_precision value: [0.75746269 0.81007752 0.91666667 0.79457364 0.80228137 0.81781377 0.93413174 
0.80932203 0.68709677 0.53284672] mean value: 0.7862272909975274 key: test_recall value: [0.875 0.91666667 0.58333333 0.91666667 0.8 0.92 0.6 0.92 1. 1. ] mean value: 0.8531666666666666 key: train_recall value: [0.92272727 0.95 0.85 0.93181818 0.96347032 0.92237443 0.71232877 0.87214612 0.96818182 0.99545455] mean value: 0.9088501452885014 key: test_roc_auc value: [0.8375 0.79833333 0.67166667 0.85833333 0.73333333 0.85583333 0.69583333 0.85583333 0.75 0.58333333] mean value: 0.7639999999999999 key: train_roc_auc value: [0.81296181 0.86312785 0.88618721 0.84490452 0.86355334 0.85891449 0.83116438 0.83380033 0.76363636 0.56136364] mean value: 0.8119613947696139 key: test_jcc value: [0.72413793 0.6875 0.46666667 0.75862069 0.60606061 0.76666667 0.5 0.76666667 0.66666667 0.54545455] mean value: 0.6488440438871473 key: train_jcc value: [0.7122807 0.77695167 0.78902954 0.75091575 0.77859779 0.76515152 0.67826087 0.72348485 0.67192429 0.5315534 ] mean value: 0.7178150368856082 MCC on Blind test: 0.54 Accuracy on Blind test: 0.83 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.02800584 0.04849577 0.05744505 0.06142521 0.03895092 0.02554607 0.02476668 0.02866244 0.02244735 0.02189159] mean value: 0.03576369285583496 key: score_time value: [0.01322865 0.01706743 0.02451825 0.02228951 0.01022005 0.01208138 0.01217604 0.01215816 0.01213026 0.01219058] mean value: 0.014806032180786133 key: test_mcc value: [0.69302938 0.63333333 0.3932917 0.68145382 0.21189236 0.48412292 0.47404284 0.59166667 0.58536941 0.71393289] mean value: 0.5462135324503263 key: train_mcc value: [0.51702284 0.76028119 0.74181793 0.71419653 0.53065219 0.45848334 0.77681141 0.74500769 0.71877954 0.72874979] mean value: 0.6691802430417745 key: test_accuracy value: [0.83673469 0.81632653 0.69387755 0.83673469 0.57142857 0.69387755 0.73469388 0.79591837 0.79166667 0.85416667] mean value: 0.7625425170068028 key: train_accuracy value: [0.72209567 0.87927107 0.86332574 0.85193622 0.73120729 0.67881549 0.88610478 0.87243736 0.85454545 0.86363636] mean value: 0.8203375440049699 key: test_fscore value: [0.80952381 0.81632653 0.65116279 0.81818182 0.36363636 0.76923077 0.72340426 0.8 0.7826087 0.8627451 ] mean value: 0.7396820130893219 key: train_fscore value: [0.62804878 0.88351648 0.84848485 0.83870968 0.64242424 0.75478261 0.87922705 0.87330317 0.84158416 0.86784141] mean value: 0.8057922429696769 key: test_precision value: [0.94444444 0.8 0.73684211 0.9 0.75 0.625 0.77272727 0.8 0.81818182 0.81481481] mean value: 0.7962010455431509 key: train_precision value: [0.9537037 0.85531915 0.95454545 0.92349727 0.95495495 0.60955056 0.93333333 0.86547085 0.92391304 
0.84188034] mean value: 0.8816168662407472 key: test_recall value: [0.70833333 0.83333333 0.58333333 0.75 0.24 1. 0.68 0.8 0.75 0.91666667] mean value: 0.7261666666666666 key: train_recall value: [0.46818182 0.91363636 0.76363636 0.76818182 0.48401826 0.99086758 0.83105023 0.88127854 0.77272727 0.89545455] mean value: 0.7769032793690328 key: test_roc_auc value: [0.83416667 0.81666667 0.69166667 0.835 0.57833333 0.6875 0.73583333 0.79583333 0.79166667 0.85416667] mean value: 0.7620833333333333 key: train_roc_auc value: [0.72267538 0.87919261 0.86355334 0.85212744 0.7306455 0.6795247 0.88597966 0.87245745 0.85454545 0.86363636] mean value: 0.8204337899543379 key: test_jcc value: [0.68 0.68965517 0.48275862 0.69230769 0.22222222 0.625 0.56666667 0.66666667 0.64285714 0.75862069] mean value: 0.6026754873479011 key: train_jcc value: [0.45777778 0.79133858 0.73684211 0.72222222 0.47321429 0.60614525 0.78448276 0.7751004 0.72649573 0.76653696] mean value: 0.6840156076754643 MCC on Blind test: 0.63 Accuracy on Blind test: 0.83 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.19333434 0.1770587 0.17805886 0.17734861 0.17584944 0.17565274 0.17738605 0.1771369 0.17707872 0.18207049] mean value: 0.17909748554229737 key: score_time value: [0.0158093 0.01619959 0.01555848 0.01538253 0.01540327 0.01564074 0.01553726 0.01590323 0.01605821 0.01594472] mean value: 0.015743732452392578 key: test_mcc value: [0.88443328 0.75793094 0.63272208 0.87833333 0.68353656 0.715 0.755 0.7145252 0.79235477 0.87576054] mean value: 0.7689596704149929 key: train_mcc value: [0.95036567 0.97747332 0.96819862 0.96371187 0.9682006 0.96836207 0.95480125 0.9635941 0.96827185 0.95470327] mean value: 0.9637682614354717 key: test_accuracy value: [0.93877551 0.87755102 0.81632653 0.93877551 0.83673469 0.85714286 0.87755102 0.85714286 0.89583333 0.9375 ] mean value: 0.8833333333333333 key: train_accuracy value: [0.97494305 0.98861048 0.98405467 0.98177677 0.98405467 0.98405467 0.97722096 0.98177677 0.98409091 0.97727273] mean value: 0.9817855663698488 key: test_fscore value: [0.94117647 0.88 0.80851064 0.93877551 0.82608696 0.85714286 0.88 0.8627451 0.89795918 0.93877551] mean value: 0.8831172224671552 key: train_fscore value: [0.9753915 0.98876404 0.98419865 0.98198198 0.98412698 0.98419865 0.97747748 0.98181818 0.98419865 0.97747748] mean value: 0.9819633583501938 key: test_precision value: [0.88888889 0.84615385 0.82608696 0.92 0.9047619 0.875 0.88 0.84615385 0.88 0.92 ] mean value: 0.8787045442480225 key: train_precision value: [0.96035242 0.97777778 0.97757848 0.97321429 0.97747748 0.97321429 0.96444444 0.97737557 0.97757848 0.96875 ] mean value: 
0.9727763210319266 key: test_recall value: [1. 0.91666667 0.79166667 0.95833333 0.76 0.84 0.88 0.88 0.91666667 0.95833333] mean value: 0.8901666666666667 key: train_recall value: [0.99090909 1. 0.99090909 0.99090909 0.99086758 0.99543379 0.99086758 0.98630137 0.99090909 0.98636364] mean value: 0.9913470319634703 key: test_roc_auc value: [0.94 0.87833333 0.81583333 0.93916667 0.83833333 0.8575 0.8775 0.85666667 0.89583333 0.9375 ] mean value: 0.8836666666666667 key: train_roc_auc value: [0.9749066 0.98858447 0.98403902 0.98175592 0.98407015 0.98408053 0.97725197 0.98178705 0.98409091 0.97727273] mean value: 0.9817839352428394 key: test_jcc value: [0.88888889 0.78571429 0.67857143 0.88461538 0.7037037 0.75 0.78571429 0.75862069 0.81481481 0.88461538] mean value: 0.7935258866293349 key: train_jcc value: [0.95196507 0.97777778 0.96888889 0.96460177 0.96875 0.96888889 0.95594714 0.96428571 0.96888889 0.95594714] mean value: 0.9645941267271599 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.06887841 0.06254005 0.09455132 0.07779312 0.06916332 0.08109474 0.09326601 0.07408905 0.09191942 0.07600975] mean value: 0.07893052101135253 key: score_time value: [0.02582598 0.0283165 0.02938533 0.03985167 0.02802324 0.02642775 0.03644276 0.03282738 0.02019882 0.02627158] mean value: 0.029357099533081056 key: test_mcc value: [0.8136762 0.75793094 0.67333333 0.80235519 0.67612782 0.55612092 0.755 0.7145252 0.83624201 0.9591663 ] mean value: 0.7544477928372326 key: train_mcc value: [0.99545455 0.97723074 0.96811889 0.99088834 0.98642422 0.96819862 0.98177667 0.98177667 0.9773636 0.98181818] mean value: 0.9809050481225259 key: test_accuracy value: [0.89795918 0.87755102 0.83673469 0.89795918 0.83673469 0.7755102 0.87755102 0.85714286 0.91666667 0.97916667] mean value: 0.875297619047619 key: train_accuracy value: [0.9977221 0.98861048 0.98405467 0.99544419 0.99316629 0.98405467 0.99088838 0.99088838 0.98863636 0.99090909] mean value: 0.9904374611720853 key: test_fscore value: [0.90566038 0.88 0.83333333 0.90196078 0.83333333 0.76595745 0.88 0.8627451 0.92 0.9787234 ] mean value: 0.8761713777441928 key: train_fscore value: [0.9977221 0.98866213 0.98412698 0.99545455 0.99310345 0.98390805 0.99086758 0.99086758 0.98871332 0.99090909] mean value: 0.9904334820036527 key: test_precision value: [0.82758621 0.84615385 0.83333333 0.85185185 0.86956522 0.81818182 0.88 0.84615385 0.88461538 1. ] mean value: 0.8657441504577936 key: train_precision value: [1. 0.98642534 0.98190045 0.99545455 1. 
0.99074074 0.99086758 0.99086758 0.98206278 0.99090909] mean value: 0.9909228109045991 key: test_recall value: [1. 0.91666667 0.83333333 0.95833333 0.8 0.72 0.88 0.88 0.95833333 0.95833333] mean value: 0.8905000000000001 key: train_recall value: [0.99545455 0.99090909 0.98636364 0.99545455 0.98630137 0.97716895 0.99086758 0.99086758 0.99545455 0.99090909] mean value: 0.989975093399751 key: test_roc_auc value: [0.9 0.87833333 0.83666667 0.89916667 0.8375 0.77666667 0.8775 0.85666667 0.91666667 0.97916667] mean value: 0.8758333333333334 key: train_roc_auc value: [0.99772727 0.98860523 0.9840494 0.99544417 0.99315068 0.98403902 0.99088834 0.99088834 0.98863636 0.99090909] mean value: 0.990433789954338 key: test_jcc value: [0.82758621 0.78571429 0.71428571 0.82142857 0.71428571 0.62068966 0.78571429 0.75862069 0.85185185 0.95833333] mean value: 0.7838510308337895 key: train_jcc value: [0.99545455 0.97757848 0.96875 0.99095023 0.98630137 0.96832579 0.98190045 0.98190045 0.97767857 0.98198198] mean value: 0.9810821867141358 MCC on Blind test: 0.77 Accuracy on Blind test: 0.9 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, 
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.16271353 0.23615193 0.19280815 0.20062351 0.17626858 0.1758728 0.16275883 0.17226195 0.17328191 0.19703627] mean value: 0.18497774600982667 key: score_time value: [0.02630448 0.02443695 0.03032565 0.0239799 0.02425122 0.02434754 0.0243566 0.02626634 0.02392149 0.03055239] mean value: 0.02587425708770752 key: test_mcc value: [0.47 0.38890873 0.43071846 0.225 0.43071846 0.38890873 0.30666667 0.47 0.58333333 0.58536941] mean value: 0.42796237929596553 key: train_mcc value: [0.97731142 0.9863426 0.98177667 0.99092928 0.99545445 0.9863426 0.98634288 0.9818178 0.98185876 0.98637383] mean value: 0.9854550288338422 key: test_accuracy value: [0.73469388 0.69387755 0.71428571 0.6122449 0.71428571 0.69387755 0.65306122 0.73469388 0.79166667 0.79166667] mean value: 0.7134353741496599 key: train_accuracy value: [0.98861048 0.99316629 0.99088838 0.99544419 0.9977221 0.99316629 0.99316629 0.99088838 0.99090909 0.99318182] mean value: 0.9927143300890453 key: test_fscore value: [0.73469388 0.66666667 0.72 0.6122449 0.70833333 0.71698113 0.65306122 0.73469388 0.79166667 0.7826087 ] mean value: 0.7120950371945333 key: train_fscore value: [0.98871332 0.99319728 0.99090909 0.99547511 0.99771167 0.99313501 0.99316629 0.99090909 0.99095023 0.99319728] mean value: 0.9927364366230393 key: test_precision value: [0.72 0.71428571 0.69230769 0.6 0.73913043 0.67857143 0.66666667 0.75 0.79166667 0.81818182] mean value: 0.7170810421462596 key: train_precision value: [0.98206278 0.99095023 0.99090909 0.99099099 1. 
0.99541284 0.99090909 0.98642534 0.98648649 0.99095023] mean value: 0.9905097075456619 key: test_recall value: [0.75 0.625 0.75 0.625 0.68 0.76 0.64 0.72 0.79166667 0.75 ] mean value: 0.7091666666666667 key: train_recall value: [0.99545455 0.99545455 0.99090909 1. 0.99543379 0.99086758 0.99543379 0.99543379 0.99545455 0.99545455] mean value: 0.9949896222498962 key: test_roc_auc value: [0.735 0.6925 0.715 0.6125 0.715 0.6925 0.65333333 0.735 0.79166667 0.79166667] mean value: 0.7134166666666667 key: train_roc_auc value: [0.98859485 0.99316106 0.99088834 0.99543379 0.99771689 0.99316106 0.99317144 0.99089871 0.99090909 0.99318182] mean value: 0.9927117061021171 key: test_jcc value: [0.58064516 0.5 0.5625 0.44117647 0.5483871 0.55882353 0.48484848 0.58064516 0.65517241 0.64285714] mean value: 0.555505546085357 key: train_jcc value: [0.97767857 0.98648649 0.98198198 0.99099099 0.99543379 0.98636364 0.98642534 0.98198198 0.98206278 0.98648649] mean value: 0.9855892045310047 MCC on Blind test: 0.49 Accuracy on Blind test: 0.79 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.71926641 0.71162152 0.71367073 0.71375799 0.71037388 0.7147007 0.70813131 0.71393132 0.70967293 0.70716023] mean value: 0.7122287034988404 key: score_time value: [0.01039338 0.01002169 0.0094986 0.00955582 0.01005411 0.00989556 0.00942087 0.00945616 0.00969458 0.00933266] mean value: 0.009732341766357422 key: test_mcc value: [0.8136762 0.80235519 0.67333333 0.92153718 0.83920658 0.6363961 0.755 0.79632832 0.75261781 1. ] mean value: 0.7990450713750232 key: train_mcc value: [0.99545445 1. 1. 0.99545445 1. 1. 1. 1. 1. 1. ] mean value: 0.9990908902647659 key: test_accuracy value: [0.89795918 0.89795918 0.83673469 0.95918367 0.91836735 0.81632653 0.87755102 0.89795918 0.875 1. ] mean value: 0.897704081632653 key: train_accuracy value: [0.9977221 1. 1. 0.9977221 1. 1. 1. 1. 1. 1. ] mean value: 0.9995444191343964 key: test_fscore value: [0.90566038 0.90196078 0.83333333 0.96 0.92307692 0.83018868 0.88 0.90196078 0.88 1. ] mean value: 0.9016180881641481 key: train_fscore value: [0.99773243 1. 1. 0.99773243 1. 1. 1. 1. 1. 1. ] mean value: 0.999546485260771 key: test_precision value: [0.82758621 0.85185185 0.83333333 0.92307692 0.88888889 0.78571429 0.88 0.88461538 0.84615385 1. ] mean value: 0.8721220720531065 key: train_precision value: [0.99547511 1. 1. 0.99547511 1. 1. 1. 1. 1. 1. ] mean value: 0.9990950226244344 key: test_recall value: [1. 0.95833333 0.83333333 1. 0.96 0.88 0.88 0.92 0.91666667 1. ] mean value: 0.9348333333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.9 0.89916667 0.83666667 0.96 0.9175 0.815 0.8775 0.8975 0.875 1. ] mean value: 0.8978333333333334 key: train_roc_auc value: [0.99771689 1. 1. 0.99771689 1. 1. 1. 1. 1. 1. ] mean value: 0.9995433789954338 key: test_jcc value: [0.82758621 0.82142857 0.71428571 0.92307692 0.85714286 0.70967742 0.78571429 0.82142857 0.78571429 1. ] mean value: 0.8246054835042599 key: train_jcc value: [0.99547511 1. 1. 0.99547511 1. 1. 1. 1. 1. 1. ] mean value: 0.9990950226244344 MCC on Blind test: 0.81 Accuracy on Blind test: 0.92 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive',
PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03085756 0.03368568 0.03450561 0.03497577 0.03449321 0.05795598 0.05796933 0.05467415 0.05438256 0.0380888 ] mean value: 0.043158864974975585 key: score_time value: [0.01293373 0.01301718 0.03573799 0.01297569 0.01292992 0.01947093 0.01932955 0.0131321 0.01314855 0.01365209] mean value: 0.016632771492004393 key: test_mcc value: [-0.05990423 -0.01162476 0.05 0.10143783 0.27326049 0.41666667 0.35343496 0.22571524 0.19245009 0.35355339] mean value: 0.1894989672081178 key: train_mcc value: [0.41406034 0.4586092 0.49471053 0.5160817 0.48532893 0.65068532 0.78556244 0.5944114 0.48932261 0.51063195] mean value: 0.5399404434769084 key: test_accuracy value: [0.46938776 0.48979592 0.51020408 0.53061224 
0.6122449 0.65306122 0.65306122 0.6122449 0.58333333 0.66666667] mean value: 0.5780612244897959 key: train_accuracy value: [0.64692483 0.67425968 0.69703872 0.71070615 0.69020501 0.79726651 0.88154897 0.76082005 0.69318182 0.70681818] mean value: 0.725876993166287 key: test_fscore value: [0.58064516 0.59016393 0.63636364 0.64615385 0.70769231 0.74626866 0.73015873 0.65454545 0.66666667 0.71428571] mean value: 0.6672944108299326 key: train_fscore value: [0.7394958 0.75471698 0.76788831 0.77601411 0.7630662 0.83111954 0.89387755 0.80662983 0.76521739 0.77328647] mean value: 0.787131218670251 key: test_precision value: [0.47368421 0.48648649 0.5 0.51219512 0.575 0.5952381 0.60526316 0.6 0.55555556 0.625 ] mean value: 0.552842262765241 key: train_precision value: [0.58666667 0.60606061 0.62322946 0.63400576 0.61690141 0.71103896 0.80811808 0.67592593 0.61971831 0.63037249] mean value: 0.6512037677464642 key: test_recall value: [0.75 0.75 0.875 0.875 0.92 1. 0.92 0.72 0.83333333 0.83333333] mean value: 0.8476666666666667 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.475 0.495 0.5175 0.5375 0.60583333 0.64583333 0.6475 0.61 0.58333333 0.66666667] mean value: 0.5784166666666667 key: train_roc_auc value: [0.64611872 0.67351598 0.69634703 0.71004566 0.69090909 0.79772727 0.88181818 0.76136364 0.69318182 0.70681818] mean value: 0.7257845579078456 key: test_jcc value: [0.40909091 0.41860465 0.46666667 0.47727273 0.54761905 0.5952381 0.575 0.48648649 0.5 0.55555556] mean value: 0.5031534139092279 key: train_jcc value: [0.58666667 0.60606061 0.62322946 0.63400576 0.61690141 0.71103896 0.80811808 0.67592593 0.61971831 0.63037249] mean value: 0.6512037677464642 MCC on Blind test: 0.16 Accuracy on Blind test: 0.43 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, 
scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02436137 0.03441906 0.03893018 0.03864837 0.0392921 0.03875279 0.0352807 0.03918028 0.03790474 0.03902316] mean value: 0.03657927513122559 key: score_time value: [0.01807237 0.02321362 0.01447272 0.02220654 0.02380419 0.02437162 0.02390718 0.0213306 0.01967859 0.02034736] mean value: 0.021140480041503908 key: test_mcc value: [0.755 0.51252158 0.47404284 0.715 0.51252158 0.71889189 0.46911585 0.7145252 0.58536941 0.70894901] mean value: 0.6165937358298096 key: train_mcc value: [0.76028119 0.74681841 0.80164338 0.74562127 0.79257426 0.75008329 0.76517092 0.76035543 0.77477899 0.72441305] mean value: 0.7621740179645166 key: test_accuracy value: [0.87755102 0.75510204 0.73469388 0.85714286 0.75510204 0.85714286 0.73469388 0.85714286 0.79166667 0.85416667] mean value: 0.8074404761904762 key: train_accuracy value: [0.87927107 0.87243736 0.89977221 0.87243736 0.8952164 0.87471526 0.88154897 0.87927107 0.88636364 0.86136364] mean value: 0.880239697659971 key: test_fscore value: [0.875 0.76 0.74509804 0.85714286 0.75 0.86792453 0.74509804 0.8627451 0.8 0.85714286] mean value: 0.8120151419058189 key: train_fscore value: [0.88351648 0.87719298 0.90350877 0.87555556 0.89867841 0.87695749 0.88546256 0.88300221 0.89035088 0.86593407] mean value: 0.8840159407660725 key: test_precision value: [0.875 0.73076923 0.7037037 0.84 0.7826087 0.82142857 0.73076923 0.84615385 0.76923077 0.84 ] mean value: 0.7939664047707526 key: train_precision value: [0.85531915 0.84745763 0.87288136 0.85652174 0.86808511 0.85964912 0.85531915 0.85470085 0.86016949 0.83829787] mean value: 
0.8568401467810323 key: test_recall value: [0.875 0.79166667 0.79166667 0.875 0.72 0.92 0.76 0.88 0.83333333 0.875 ] mean value: 0.8321666666666667 key: train_recall value: [0.91363636 0.90909091 0.93636364 0.89545455 0.93150685 0.89497717 0.91780822 0.91324201 0.92272727 0.89545455] mean value: 0.9130261519302615 key: test_roc_auc value: [0.8775 0.75583333 0.73583333 0.8575 0.75583333 0.85583333 0.73416667 0.85666667 0.79166667 0.85416667] mean value: 0.8074999999999999 key: train_roc_auc value: [0.87919261 0.87235367 0.89968867 0.87238481 0.89529888 0.87476131 0.88163138 0.87934828 0.88636364 0.86136364] mean value: 0.8802386882523869 key: test_jcc value: [0.77777778 0.61290323 0.59375 0.75 0.6 0.76666667 0.59375 0.75862069 0.66666667 0.75 ] mean value: 0.6870135026572735 key: train_jcc value: [0.79133858 0.78125 0.824 0.77865613 0.816 0.78087649 0.7944664 0.79051383 0.80237154 0.76356589] mean value: 0.7923038873312278 MCC on Blind test: 0.68 Accuracy on Blind test: 0.87 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.34428215 0.34850931 0.45896435 0.38414645 0.37303638 0.35414553 0.46823978 0.45267653 0.56025004 0.50578809] mean value: 0.4250038623809814 key: score_time value: [0.01888943 0.01431537 0.02232623 0.02402639 0.02232146 0.02438354 0.03082514 0.03807449 0.03736305 0.03020906] mean value: 0.026273417472839355 key: test_mcc value: [0.79666667 0.56448787 0.47404284 0.715 0.51252158 0.71889189 0.51089422 0.7145252 0.62994079 0.71393289] mean value: 0.6350903953454979 key: train_mcc value: [0.68873977 0.70769795 0.80164338 0.69247872 0.79257426 0.69743616 0.73038867 0.76035543 0.71907224 0.71266318] mean value: 0.7303049743531872 key: test_accuracy value: [0.89795918 0.7755102 0.73469388 0.85714286 0.75510204 0.85714286 0.75510204 0.85714286 0.8125 0.85416667] mean value: 0.8156462585034013 key: train_accuracy value: [0.8428246 0.85193622 0.89977221 0.84510251 0.8952164 0.84738041 0.86332574 0.87927107 0.85681818 0.85454545] mean value: 0.8636192793539035 key: test_fscore value: [0.89795918 0.79245283 0.74509804 0.85714286 0.75 0.86792453 0.76923077 0.8627451 0.82352941 0.8627451 ] mean value: 0.8228827815596486 key: train_fscore value: /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:176: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:179: 
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [0.85032538 0.85961123 0.90350877 0.85152838 0.89867841 0.85339168 0.86956522 0.88300221 0.86509636 0.86147186] mean value: 0.869617951203053 key: test_precision value: [0.88 0.72413793 0.7037037 0.84 0.7826087 0.82142857 0.74074074 0.84615385 0.77777778 0.81481481] mean value: 0.7931366081306112 key: train_precision value: [0.81327801 0.81893004 0.87288136 0.81932773 0.86808511 0.81932773 0.82987552 0.85470085 0.81781377 0.82231405] mean value: 0.8336534162093092 key: test_recall value: [0.91666667 0.875 0.79166667 0.875 0.72 0.92 0.8 0.88 0.875 0.91666667] mean value: 0.857 key: train_recall value: [0.89090909 0.90454545 0.93636364 0.88636364 0.93150685 0.89041096 0.91324201 0.91324201 0.91818182 0.90454545] mean value: 0.9089310917393109 key: test_roc_auc value: [0.89833333 0.7775 0.73583333 0.8575 0.75583333 0.85583333 0.75416667 0.85666667 0.8125 0.85416667] mean value: 0.8158333333333333 key: train_roc_auc value: [0.84271482 0.85181611 0.89968867 0.8450083 0.89529888 0.84747821 0.86343919 0.87934828 0.85681818 0.85454545] mean value: 0.8636156081361561 key: test_jcc value: [0.81481481 0.65625 0.59375 0.75 0.6 0.76666667 0.625 0.75862069 0.7 0.75862069] mean value: 0.7023722860791826 key: train_jcc value: [0.73962264 0.75378788 0.824 0.74144487 0.816 0.74427481 0.76923077 0.79051383 0.76226415 0.75665399] mean value: 0.7697792942939468 MCC on Blind test: 0.68 Accuracy on Blind test: 0.86 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive 
Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 
'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.13643479 0.17528868 0.20533156 0.28455257 0.26884556 0.20184398 0.19602919 0.17463088 0.19731784 0.15040684] mean value: 0.19906818866729736 key: score_time value: [0.02278376 0.02012134 0.04163885 0.03131413 0.02843094 0.01659536 0.03433967 0.02756572 0.02681756 0.0273664 ] mean value: 0.027697372436523437 key: test_mcc value: [0.65489945 0.65489945 0.72727273 0.79708114 0.77041694 0.72760688 0.74456392 0.67161876 0.74456392 0.74456392] mean value: 0.7237487107624094 key: train_mcc value: [0.75948704 0.78052753 0.76025862 0.77108311 0.75585691 0.76998793 0.76207246 0.76521684 0.75396519 0.7645048 ] mean value: 0.7642960432414326 key: test_accuracy value: [0.82706767 0.82706767 0.86363636 0.89393939 0.87878788 0.86363636 0.87121212 0.83333333 0.87121212 0.87121212] mean value: 0.8601105035315562 key: train_accuracy value: [0.87888982 0.88898234 0.8789916 0.88487395 0.87731092 0.88403361 0.88067227 0.88151261 0.87647059 0.88151261] mean value: 0.8813250312740739 key: test_fscore value: [0.82962963 0.82442748 0.86363636 0.90140845 0.88888889 0.86567164 0.87591241 0.84285714 0.87591241 0.87591241] mean value: 0.8644256824700698 key: train_fscore value: [0.88292683 0.89320388 0.88349515 0.88816327 0.88071895 0.88798701 0.88322368 0.88582996 0.87960688 0.88508557] mean value: 0.885024118883971 key: test_precision value: [0.8115942 0.84375 0.86363636 0.84210526 0.82051282 0.85294118 0.84507042 
0.7972973 0.84507042 0.84507042] mean value: 0.8367048391579148 key: train_precision value: [0.85511811 0.85981308 0.85179407 0.86349206 0.85691574 0.85871272 0.8647343 0.8546875 0.85782748 0.85917722] mean value: 0.8582272275472678 key: test_recall value: [0.84848485 0.80597015 0.86363636 0.96969697 0.96969697 0.87878788 0.90909091 0.89393939 0.90909091 0.90909091] mean value: 0.8957485300768883 key: train_recall value: [0.91260504 0.92929293 0.91764706 0.91428571 0.90588235 0.91932773 0.90252101 0.91932773 0.90252101 0.91260504] mean value: 0.9136015618368559 key: test_roc_auc value: [0.8272275 0.8272275 0.86363636 0.89393939 0.87878788 0.86363636 0.87121212 0.83333333 0.87121212 0.87121212] mean value: 0.8601424694708277 key: train_roc_auc value: [0.87886144 0.88901621 0.8789916 0.88487395 0.87731092 0.88403361 0.88067227 0.88151261 0.87647059 0.88151261] mean value: 0.8813255807373455 key: test_jcc value: [0.70886076 0.7012987 0.76 0.82051282 0.8 0.76315789 0.77922078 0.72839506 0.77922078 0.77922078] mean value: 0.7619887575432768 key: train_jcc value: [0.79039301 0.80701754 0.79130435 0.79882526 0.78686131 0.79854015 0.79086892 0.79505814 0.78508772 0.79385965] mean value: 0.7937816054460703 MCC on Blind test: 0.69 Accuracy on Blind test: 0.88 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [4.39447451 2.87602019 1.87464285 1.19950271 2.72753072 2.3022213 2.38066149 2.41723609 2.14946175 2.22808075] mean value: 2.4549832344055176 key: score_time value: [0.03131342 0.01659775 0.0143137 0.01979375 0.02363372 0.02368855 0.04590464 0.0226686 0.02403808 0.02302599] mean value: 0.02449781894683838 key: test_mcc value: [0.66915423 0.65616669 0.75897093 0.76642417 0.81060226 0.71285802 0.72760688 0.70214689 0.85004744 0.72861209] mean value: 0.7382589583712581 key: train_mcc value: [0.78730353 0.83890836 0.80886044 0.82694799 0.76869281 0.81700175 0.81049737 0.78488114 0.82362364 0.77390646] mean value: 0.8040623492161448 key: test_accuracy value: [0.83458647 0.82706767 0.87878788 0.87878788 0.90151515 0.85606061 0.86363636 0.84848485 0.92424242 0.86363636] mean value: 0.867680565048986 key: train_accuracy value: [0.89318755 0.91925988 0.90420168 0.91344538 0.88403361 0.90840336 0.90504202 
0.89159664 0.91176471 0.88655462] mean value: 0.9017489451625899 key: test_fscore value: [0.83333333 0.82170543 0.875 0.88732394 0.90780142 0.85925926 0.86567164 0.85714286 0.92647059 0.86764706] mean value: 0.8701355527043595 key: train_fscore value: [0.89581624 0.92039801 0.90578512 0.91395155 0.88632619 0.90939318 0.90653433 0.89503662 0.91242702 0.88907149] mean value: 0.9034739751181698 key: test_precision value: [0.83333333 0.85483871 0.90322581 0.82894737 0.85333333 0.84057971 0.85294118 0.81081081 0.9 0.84285714] mean value: 0.8520867391500221 key: train_precision value: [0.875 0.90686275 0.89105691 0.90863787 0.86914378 0.89967105 0.89250814 0.86750789 0.90562914 0.86977492] mean value: 0.8885792450788471 key: test_recall value: [0.83333333 0.79104478 0.84848485 0.95454545 0.96969697 0.87878788 0.87878788 0.90909091 0.95454545 0.89393939] mean value: 0.8912256897331524 key: train_recall value: [0.91764706 0.93434343 0.9210084 0.91932773 0.90420168 0.91932773 0.9210084 0.92436975 0.91932773 0.9092437 ] mean value: 0.9189805619217384 key: test_roc_auc value: [0.83457711 0.82734057 0.87878788 0.87878788 0.90151515 0.85606061 0.86363636 0.84848485 0.92424242 0.86363636] mean value: 0.8677069199457259 key: train_roc_auc value: [0.89316696 0.91927256 0.90420168 0.91344538 0.88403361 0.90840336 0.90504202 0.89159664 0.91176471 0.88655462] mean value: 0.9017481538069773 key: test_jcc value: [0.71428571 0.69736842 0.77777778 0.79746835 0.83116883 0.75324675 0.76315789 0.75 0.8630137 0.76623377] mean value: 0.7713721211562833 key: train_jcc value: [0.81129272 0.85253456 0.82779456 0.84153846 0.79585799 0.83384146 0.8290469 0.81001473 0.83895706 0.80029586] mean value: 0.8241174295814014 MCC on Blind test: 0.68 Accuracy on Blind test: 0.87 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), 
('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 
'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.03249121 0.03886437 0.01911259 0.01886559 0.01927328 0.01915526 0.01913929 0.01914072 0.01931453 0.0187757 ] mean value: 0.022413253784179688 key: score_time value: [0.03683639 0.03328943 0.01319766 0.01342869 0.01344657 0.01330543 0.0137465 0.01340461 0.01348186 0.01355147] mean value: 0.01776885986328125 key: test_mcc value: [0.39968644 0.44252459 0.3955774 0.41026992 0.46975089 0.50999094 0.51729353 0.42739376 0.5000574 0.55182541] mean value: 0.46243702683016985 key: train_mcc value: [0.49973658 0.51693179 0.48338956 0.49447762 0.48289321 0.48000922 0.48867275 0.50527161 0.48144571 0.48144571] mean value: 0.491427375370929 key: test_accuracy value: [0.69924812 0.71428571 0.6969697 0.70454545 0.73484848 0.75 0.75757576 0.71212121 0.75 0.77272727] mean value: 0.7292321713374346 key: train_accuracy value: [0.74852817 0.7569386 0.74033613 0.74621849 0.74033613 0.73865546 0.74369748 0.7512605 0.7394958 0.7394958 ] mean value: 0.7444962577125047 key: test_fscore value: [0.68253968 0.6779661 0.68253968 0.69291339 0.73684211 0.72268908 0.74603175 0.69354839 0.7518797 0.75409836] mean value: 0.714104822652684 key: train_fscore value: [0.73516386 0.74265361 0.72582076 0.73415493 0.72727273 0.72404614 0.73408893 0.73758865 0.72566372 0.72566372] mean value: 0.7312117042117169 key: test_precision value: [0.71666667 0.78431373 0.71666667 0.72131148 0.73134328 0.81132075 0.78333333 0.74137931 0.74626866 0.82142857] mean 
value: 0.7574032444355586 key: train_precision value: [0.77715356 0.78827977 0.76879699 0.77079482 0.76579926 0.76691729 0.76268116 0.7804878 0.76635514 0.76635514] mean value: 0.7713620942500627 key: test_recall value: [0.65151515 0.59701493 0.65151515 0.66666667 0.74242424 0.65151515 0.71212121 0.65151515 0.75757576 0.6969697 ] mean value: 0.6778833107191315 key: train_recall value: [0.69747899 0.7020202 0.68739496 0.70084034 0.69243697 0.68571429 0.70756303 0.69915966 0.68907563 0.68907563] mean value: 0.6950759697818522 key: test_roc_auc value: [0.6988919 0.71517413 0.6969697 0.70454545 0.73484848 0.75 0.75757576 0.71212121 0.75 0.77272727] mean value: 0.7292853912256897 key: train_roc_auc value: [0.74857115 0.75689245 0.74033613 0.74621849 0.74033613 0.73865546 0.74369748 0.7512605 0.7394958 0.7394958 ] mean value: 0.7444959397900575 key: test_jcc value: [0.51807229 0.51282051 0.51807229 0.53012048 0.58333333 0.56578947 0.59493671 0.5308642 0.60240964 0.60526316] mean value: 0.5561682082919598 key: train_jcc value: [0.58123249 0.59065156 0.56963788 0.57997218 0.57142857 0.5674548 0.57988981 0.58426966 0.56944444 0.56944444] mean value: 0.5763425846399886 MCC on Blind test: 0.41 Accuracy on Blind test: 0.76 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, 
oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.0336113 0.03431821 0.01922846 0.01908755 0.01887369 0.04641771 0.03898787 0.04656267 0.04592562 0.04554367] mean value: 0.03485567569732666 key: score_time value: [0.02585936 0.01357031 0.0133636 0.01322436 0.0131259 0.02207923 0.02071619 0.02628255 0.02265048 0.02270865] mean value: 0.019358062744140626 key: test_mcc value: [0.5339213 0.62602155 0.37987955 0.57602211 0.58003439 0.5768179 0.66666667 0.59152048 0.66789441 0.63753558] mean value: 0.5836313947320002 key: train_mcc value: [0.62153995 0.63685301 0.61201456 0.59876685 0.59876685 0.61017022 0.63706477 0.62017157 0.61512692 0.57557164] mean value: 0.6126046335536118 key: test_accuracy value: [0.76691729 0.81203008 0.68939394 0.78787879 0.78787879 0.78787879 0.83333333 0.79545455 0.83333333 0.81818182] mean value: 0.7912280701754386 key: train_accuracy value: [0.81076535 0.81833474 0.80588235 0.79915966 0.79915966 0.80504202 0.81848739 0.81008403 0.80756303 0.78739496] mean value: 0.8061873193347987 key: test_fscore value: [0.76691729 0.80620155 0.67716535 0.78461538 0.8 0.78125 0.83333333 0.8 0.83823529 0.8125 ] mean value: 0.7900218210017753 key: train_fscore value: [0.8104465 0.8202995 0.80306905 0.79520137 0.79520137 0.80338983 0.82 0.80976431 0.80740118 0.78170837] mean value: 0.8046481487421849 key: test_precision value: [0.76119403 0.83870968 0.70491803 0.796875 0.75675676 0.80645161 0.83333333 0.7826087 0.81428571 0.83870968] mean value: 0.7933842530407546 key: train_precision value: [0.8125 0.81085526 0.81487889 0.81118881 0.81118881 0.81025641 0.81322314 0.81112985 0.80808081 0.80319149] mean value: 
0.8106493474693212 key: test_recall value: [0.77272727 0.7761194 0.65151515 0.77272727 0.84848485 0.75757576 0.83333333 0.81818182 0.86363636 0.78787879] mean value: 0.7882180009045681 key: train_recall value: [0.80840336 0.82996633 0.79159664 0.77983193 0.77983193 0.79663866 0.82689076 0.80840336 0.80672269 0.76134454] mean value: 0.7989630195512548 key: test_roc_auc value: [0.76696065 0.81230213 0.68939394 0.78787879 0.78787879 0.78787879 0.83333333 0.79545455 0.83333333 0.81818182] mean value: 0.7912596110357304 key: train_roc_auc value: [0.81076734 0.81834451 0.80588235 0.79915966 0.79915966 0.80504202 0.81848739 0.81008403 0.80756303 0.78739496] mean value: 0.8061884956002603 key: test_jcc value: [0.62195122 0.67532468 0.51190476 0.64556962 0.66666667 0.64102564 0.71428571 0.66666667 0.72151899 0.68421053] mean value: 0.6549124479297047 key: train_jcc value: [0.68130312 0.69534556 0.67094017 0.66002845 0.66002845 0.6713881 0.69491525 0.68033946 0.67700987 0.64164306] mean value: 0.673294149450316 MCC on Blind test: 0.52 Accuracy on Blind test: 0.81 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.02682781 0.01738381 0.01693344 0.0169661 0.01902151 0.03203535 0.01671958 0.03684998 0.03678107 0.02850962] mean value: 0.02480282783508301 key: score_time value: [0.05919313 0.04242587 0.04569626 0.04686856 0.07781005 0.05949831 0.05094695 0.0685482 0.0651741 0.06896615] mean value: 0.058512759208679196 key: test_mcc value: [0.64039914 0.61310775 0.48484848 0.62017367 0.56855832 0.66943868 0.65765844 0.64715023 0.5934603 0.60858062] mean value: 0.6103375626732229 key: train_mcc value: [0.74859568 0.7179162 0.73835124 0.7226 0.7328732 0.74237715 0.7368131 0.73393856 0.74262079 0.7275125 ] mean value: 0.7343598397022401 key: test_accuracy value: [0.81954887 0.80451128 0.74242424 0.8030303 0.78030303 0.83333333 0.82575758 0.81818182 0.78787879 0.8030303 ] mean value: 0.8017999544315334 key: train_accuracy value: [0.87132044 0.85534062 0.86638655 0.85714286 0.86302521 0.86806723 0.86554622 0.86302521 0.86722689 0.85966387] mean value: 0.8636745093327491 key: test_fscore value: [0.82352941 0.81690141 0.74242424 0.82191781 0.7972028 0.84057971 0.83687943 0.83333333 0.81081081 0.8115942 ] mean value: 0.8135173157873363 key: train_fscore value: [0.87905138 0.86477987 0.87410926 0.8671875 0.87175452 0.87608524 0.87341772 0.87235709 0.87636933 0.86942924] mean value: 0.8724541163103992 key: test_precision value: [0.8 0.77333333 0.74242424 0.75 0.74025974 0.80555556 0.78666667 0.76923077 0.73170732 0.77777778] mean value: 0.7676955402321256 key: train_precision value: [0.82985075 0.81120944 0.82634731 0.81021898 0.81952663 0.82589286 0.82511211 0.81671554 0.81991215 
0.8128655 ] mean value: 0.819765125314062 key: test_recall value: [0.84848485 0.86567164 0.74242424 0.90909091 0.86363636 0.87878788 0.89393939 0.90909091 0.90909091 0.84848485] mean value: 0.8668701944821348 key: train_recall value: [0.93445378 0.92592593 0.92773109 0.93277311 0.93109244 0.93277311 0.92773109 0.93613445 0.94117647 0.93445378] mean value: 0.9324245253657019 key: test_roc_auc value: [0.81976481 0.80404794 0.74242424 0.8030303 0.78030303 0.83333333 0.82575758 0.81818182 0.78787879 0.8030303 ] mean value: 0.8017752148349164 key: train_roc_auc value: [0.87126729 0.85539994 0.86638655 0.85714286 0.86302521 0.86806723 0.86554622 0.86302521 0.86722689 0.85966387] mean value: 0.8636751266163031 key: test_jcc value: [0.7 0.69047619 0.59036145 0.69767442 0.6627907 0.725 0.7195122 0.71428571 0.68181818 0.68292683] mean value: 0.6864845673032532 key: train_jcc value: [0.7842031 0.76177285 0.77637131 0.76551724 0.77266388 0.77949438 0.7752809 0.77361111 0.77994429 0.76901798] mean value: 0.7737877045149908 MCC on Blind test: 0.45 Accuracy on Blind test: 0.76 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, 
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.08626294 0.06776714 0.06693482 0.08487916 0.08456087 0.08712363 0.07345867 0.06638074 0.08891439 0.13808131] mean value: 0.08443636894226074 key: score_time value: [0.02497745 0.02349496 0.02379441 0.04615092 0.02772403 0.02862549 0.02358913 0.02368736 0.03861403 0.04184937] mean value: 0.03025071620941162 key: test_mcc value: [0.65616669 0.67021931 0.65339283 0.71319972 0.75295561 0.69825325 0.75295561 0.63002408 0.75295561 0.70511024] mean value: 0.698523295201196 key: train_mcc value: [0.72978573 0.74735252 0.73100739 0.73527645 0.72393206 0.7254279 0.73067537 0.74376299 0.72326251 0.73466098] mean value: 0.732514389766043 key: test_accuracy value: [0.82706767 0.83458647 0.82575758 0.84848485 0.87121212 0.84848485 0.87121212 0.81060606 0.87121212 0.84848485] mean value: 0.8457108680792891 key: train_accuracy value: [0.86206897 0.86963835 0.86218487 0.86470588 0.85966387 0.85966387 0.86218487 0.86806723 0.85882353 0.86470588] mean value: 0.8631707317073171 key: test_fscore value: [0.83211679 0.84057971 0.83211679 0.8630137 0.88111888 0.85294118 0.88111888 0.82517483 0.88111888 0.85915493] mean value: 0.8548454559996922 key: train_fscore value: [0.87025316 0.87843137 0.87086614 0.87272727 0.86714399 0.86819258 0.87066246 0.87686275 0.86708861 0.87232355] mean value: 0.8714551892097665 key: test_precision value: [0.8028169 0.81690141 0.8028169 0.7875 0.81818182 0.82857143 0.81818182 0.76623377 0.81818182 0.80263158] mean value: 0.8062017439565624 key: train_precision value: [0.82212257 0.82232012 0.81925926 0.8238806 0.82326284 0.81845238 0.82020802 0.82205882 
0.81913303 0.82582583] mean value: 0.8216523473090571 key: test_recall value: [0.86363636 0.86567164 0.86363636 0.95454545 0.95454545 0.87878788 0.95454545 0.89393939 0.95454545 0.92424242] mean value: 0.9108095884215287 key: train_recall value: [0.92436975 0.94276094 0.92941176 0.92773109 0.91596639 0.92436975 0.92773109 0.9394958 0.9210084 0.92436975] mean value: 0.9277214724273548 key: test_roc_auc value: [0.82734057 0.83435097 0.82575758 0.84848485 0.87121212 0.84848485 0.87121212 0.81060606 0.87121212 0.84848485] mean value: 0.8457146087743103 key: train_roc_auc value: [0.86201652 0.8696998 0.86218487 0.86470588 0.85966387 0.85966387 0.86218487 0.86806723 0.85882353 0.86470588] mean value: 0.8631716322892794 key: test_jcc value: [0.7125 0.725 0.7125 0.75903614 0.7875 0.74358974 0.7875 0.70238095 0.7875 0.75308642] mean value: 0.7470593260302095 key: train_jcc value: [0.77030812 0.78321678 0.77126918 0.77419355 0.76544944 0.76708508 0.77094972 0.78072626 0.76536313 0.77355837] mean value: 0.772211962153118 MCC on Blind test: 0.66 Accuracy on Blind test: 0.86 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [7.32249093 5.55219841 8.19922113 6.65408945 1.15874195 8.76371288 5.59072351 5.6803503 5.23264861 5.51218748] mean value: 5.96663646697998 key: score_time value: [0.02065516 0.02101636 0.02099442 0.02098346 0.02147007 0.02328658 0.0287044 0.01435661 0.01431417 0.02102661] mean value: 0.020680785179138184 key: test_mcc value: [0.80538602 0.78986657 0.89404202 0.82773811 0.69728992 0.83806027 0.87177979 0.88531564 0.81060226 0.83806027] mean value: 0.8258140871173387 key: train_mcc value: [0.97814523 0.92478051 0.95837445 0.95185354 0.74159554 0.98319883 0.97324671 0.99160924 0.95858042 0.97816369] mean value: 0.9439548161998859 key: test_accuracy value: [0.90225564 0.89473684 0.9469697 0.90909091 0.84848485 0.91666667 0.93181818 0.93939394 0.90151515 0.91666667] mean value: 0.9107598541809068 key: train_accuracy value: [0.98906644 0.96215307 0.9789916 0.97563025 0.87058824 0.99159664 0.98655462 0.99579832 0.9789916 0.98907563] mean value: 0.9718446402951424 key: test_fscore 
value: [0.9037037 0.89393939 0.94736842 0.91549296 0.85074627 0.92086331 0.93617021 0.94285714 0.90780142 0.92086331] mean value: 0.9139806137866777 key: train_fscore value: [0.9891031 0.96271748 0.97928749 0.97605285 0.86837607 0.99161074 0.98666667 0.99580889 0.9793559 0.98904802] mean value: 0.971802720420434 key: test_precision value: [0.88405797 0.90769231 0.94029851 0.85526316 0.83823529 0.87671233 0.88 0.89189189 0.85333333 0.87671233] mean value: 0.8804197120941343 key: train_precision value: [0.98662207 0.94779772 0.96568627 0.95941558 0.88347826 0.98994975 0.9785124 0.99331104 0.96266234 0.99155405] mean value: 0.9658989483467253 key: test_recall value: [0.92424242 0.88059701 0.95454545 0.98484848 0.86363636 0.96969697 1. 1. 0.96969697 0.96969697] mean value: 0.9516960651289009 key: train_recall value: [0.99159664 0.97811448 0.99327731 0.99327731 0.85378151 0.99327731 0.99495798 0.99831933 0.99663866 0.98655462] mean value: 0.9779795150383386 key: test_roc_auc value: [0.90241972 0.89484396 0.9469697 0.90909091 0.84848485 0.91666667 0.93181818 0.93939394 0.90151515 0.91666667] mean value: 0.91078697421981 key: train_roc_auc value: [0.98906431 0.96216648 0.9789916 0.97563025 0.87058824 0.99159664 0.98655462 0.99579832 0.9789916 0.98907563] mean value: 0.9718457686104744 key: test_jcc value: [0.82432432 0.80821918 0.9 0.84415584 0.74025974 0.85333333 0.88 0.89189189 0.83116883 0.85333333] mean value: 0.8426686476549491 key: train_jcc value: [0.97844113 0.92811502 0.95941558 0.95322581 0.7673716 0.98336106 0.97368421 0.99165275 0.95954693 0.97833333] mean value: 0.9473147424653781 MCC on Blind test: 0.66 Accuracy on Blind test: 0.87 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', 
KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.06732917 0.04838347 0.05075574 0.05332375 0.0462842 0.06115699 0.05194139 0.05070543 0.05754256 0.05003095] mean value: 0.053745365142822264 key: score_time value: [0.0100081 0.00959229 0.00984335 0.00956178 0.00954437 0.00966573 0.00966644 0.00967693 0.00957775 0.01000595] mean value: 0.009714269638061523 key: test_mcc value: [0.94028503 0.89560771 0.91287093 0.95553309 0.85839508 0.89901011 0.88040627 0.9701425 0.87177979 0.93939394] mean value: 0.9123424440936554 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.96992481 0.94736842 0.95454545 0.97727273 0.92424242 0.9469697 0.93939394 0.98484848 0.93181818 0.96969697] mean value: 0.9546081111870586 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.97014925 0.94890511 0.95652174 0.97777778 0.92957746 0.94964029 0.94117647 0.98507463 0.93617021 0.96969697] mean value: 0.9564689912603958 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.95588235 0.92857143 0.91666667 0.95652174 0.86842105 0.90410959 0.91428571 0.97058824 0.88 0.96969697] mean value: 0.9264743748259183 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.97014925 1. 1. 1. 1. 0.96969697 1. 1. 0.96969697] mean value: 0.9894391677973767 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.97003618 0.94719584 0.95454545 0.97727273 0.92424242 0.9469697 0.93939394 0.98484848 0.93181818 0.96969697] mean value: 0.9546019900497512 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.94202899 0.90277778 0.91666667 0.95652174 0.86842105 0.90410959 0.88888889 0.97058824 0.88 0.94117647] mean value: 0.9171179405526042 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.89 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.20921516 0.21479702 0.22188973 0.2206893 0.22368717 0.22040296 0.19971609 0.21791267 0.21999526 0.21728563] mean value: 0.216559100151062 key: score_time value: [0.02080321 0.0215764 0.01975822 0.02135944 0.02216983 0.02127004 0.0214529 0.02133965 0.02135205 0.02098799] mean value: 0.021206974983215332 key: test_mcc value: [0.86718264 0.89484396 0.90909091 0.91287093 0.92690611 0.83806027 0.89901011 0.92690611 0.9251987 0.90909091] mean value: 0.9009160650602038 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.93233083 0.94736842 0.95454545 0.95454545 0.96212121 0.91666667 0.9469697 0.96212121 0.96212121 0.95454545] mean value: 0.9493335611756665 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.93430657 0.94736842 0.95454545 0.95652174 0.96350365 0.92086331 0.94964029 0.96350365 0.96296296 0.95454545] mean value: 0.9507761497972379 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.90140845 0.95454545 0.95454545 0.91666667 0.92957746 0.87671233 0.90410959 0.92957746 0.94202899 0.95454545] mean value: 0.9263717313900186 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96969697 0.94029851 0.95454545 1. 1. 0.96969697 1. 1. 0.98484848 0.95454545] mean value: 0.977363184079602 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.93260968 0.94742198 0.95454545 0.95454545 0.96212121 0.91666667 0.9469697 0.96212121 0.96212121 0.95454545] mean value: 0.949366802351877 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.87671233 0.9 0.91304348 0.91666667 0.92957746 0.85333333 0.90410959 0.92957746 0.92857143 0.91304348] mean value: 0.9064635232478852 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.49 Accuracy on Blind test: 0.81 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01371717 0.01392555 0.01349926 0.01353741 0.01367569 0.01371884 0.01378632 0.01376367 0.01417041 0.01406479] mean value: 0.013785910606384278 key: score_time value: [0.00944543 0.01027775 0.00953388 0.00994778 0.00947404 0.00971413 0.00948572 0.00943255 0.00957513 0.0105927 ] mean value: 0.009747910499572753 key: test_mcc value: [0.71569714 0.8253812 0.82158384 0.86853519 0.8824419 0.82425939 0.83205029 0.92690611 0.81442137 0.86612538] mean value: 0.8377401802212678 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.84962406 0.90977444 0.90909091 0.93181818 0.93939394 0.90909091 0.90909091 0.96212121 0.90151515 0.93181818] mean value: 0.9153337890179996 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8630137 0.91549296 0.91304348 0.9352518 0.94202899 0.91428571 0.91666667 0.96350365 0.90909091 0.93430657] mean value: 0.9206684427727275 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.7875 0.86666667 0.875 0.89041096 0.90277778 0.86486486 0.84615385 0.92957746 0.84415584 0.90140845] mean value: 0.8708515874016067 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.95454545 0.97014925 0.95454545 0.98484848 0.98484848 0.96969697 1. 1. 0.98484848 0.96969697] mean value: 0.9773179556761646 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.85040706 0.90931705 0.90909091 0.93181818 0.93939394 0.90909091 0.90909091 0.96212121 0.90151515 0.93181818] mean value: 0.9153663500678426 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.75903614 0.84415584 0.84 0.87837838 0.89041096 0.84210526 0.84615385 0.92957746 0.83333333 0.87671233] mean value: 0.8539863562217576 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.38 Accuracy on Blind test: 0.76 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [3.85031939 3.69451547 3.61622024 3.522048 3.53350186 3.76046658 3.82019472 3.82504964 3.79067063 3.809582 ] mean value: 3.7222568511962892 key: score_time value: [0.11130786 0.16276884 0.1515305 0.1129837 0.11014009 0.1188767 0.11364388 0.11909842 0.1191082 0.11907864] mean value: 0.12385368347167969 key: test_mcc value: [0.89732778 0.94028503 0.9251987 0.92690611 0.89901011 0.89651574 0.92690611 0.94112395 0.86853519 0.9251987 ] mean value: 0.9147007416365505 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.96992481 0.96212121 0.96212121 0.9469697 0.9469697 0.96212121 0.96969697 0.93181818 0.96212121] mean value: 0.95612326270221 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94890511 0.96969697 0.96296296 0.96350365 0.94964029 0.94890511 0.96350365 0.97058824 0.9352518 0.96296296] mean value: 0.9575920735496123 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.91549296 0.98461538 0.94202899 0.92957746 0.90410959 0.91549296 0.92957746 0.94285714 0.89041096 0.94202899] mean value: 0.9296191891502648 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.95522388 0.98484848 1. 1. 0.98484848 1. 1. 0.98484848 0.98484848] mean value: 0.9879466304839439 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.94764812 0.97003618 0.96212121 0.96212121 0.9469697 0.9469697 0.96212121 0.96969697 0.93181818 0.96212121] mean value: 0.9561623699683401 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.90277778 0.94117647 0.92857143 0.92957746 0.90410959 0.90277778 0.92957746 0.94285714 0.87837838 0.92857143] mean value: 0.918837492314073 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.74 Accuracy on Blind test: 0.9 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, 
tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0...05', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.25330877 2.00703502 2.64860749 1.26994395 1.31967926 1.25739765 1.30199599 2.39054012 3.48491287 3.10396361] mean value: 2.0037384748458864 key: score_time value: [0.20351052 0.23993278 0.24853969 0.14206243 0.18108511 0.20508718 0.20811629 0.25352764 0.27486992 0.16690373] mean value: 0.21236352920532225 key: test_mcc value: [0.85319305 0.89484396 0.89404202 0.92690611 0.91287093 0.86452993 0.88040627 0.9251987 0.85478752 0.90950859] mean value: 0.8916287091253218 key: train_mcc value: [0.95517845 0.95193369 0.94656078 0.95185354 0.94537267 0.95164901 0.93839978 0.94873699 0.95347998 0.94524429] mean value: 0.9488409181794549 key: test_accuracy value: [0.92481203 0.94736842 0.9469697 0.96212121 0.95454545 0.93181818 0.93939394 0.96212121 0.92424242 0.95454545] mean value: 0.9447938026885395 key: train_accuracy value: [0.97729184 0.97560976 0.97310924 0.97563025 0.97226891 0.97563025 0.96890756 0.97394958 0.97647059 0.97226891] mean value: 0.9741136892099144 key: test_fscore value: [0.92753623 0.94736842 0.94736842 0.96350365 0.95652174 0.93333333 0.94117647 0.96296296 0.92857143 0.95522388] mean value: 0.9463566538807767 key: train_fscore value: [0.97770438 0.97605285 0.973466 0.97605285 0.97283951 0.97597349 0.96944674 0.9744856 0.9768595 0.97279472] mean value: 0.974567563469322 key: test_precision value: [0.88888889 0.95454545 0.94029851 0.92957746 0.91666667 0.91304348 0.91428571 0.94202899 0.87837838 0.94117647] mean value: 0.9218890009372873 key: train_precision value: [0.96103896 0.95786062 
0.96072013 0.95941558 0.95322581 0.9624183 0.95292208 0.95483871 0.96097561 0.95469256] mean value: 0.9578108353365855 key: test_recall value: [0.96969697 0.94029851 0.95454545 1. 1. 0.95454545 0.96969697 0.98484848 0.98484848 0.96969697] mean value: 0.9728177295341475 key: train_recall value: [0.99495798 0.99494949 0.98655462 0.99327731 0.99327731 0.98991597 0.98655462 0.99495798 0.99327731 0.99159664] mean value: 0.9919319242848654 key: test_roc_auc value: [0.92514699 0.94742198 0.9469697 0.96212121 0.95454545 0.93181818 0.93939394 0.96212121 0.92424242 0.95454545] mean value: 0.9448326549072817 key: train_roc_auc value: [0.97727697 0.97562601 0.97310924 0.97563025 0.97226891 0.97563025 0.96890756 0.97394958 0.97647059 0.97226891] mean value: 0.9741138273491214 key: test_jcc value: [0.86486486 0.9 0.9 0.92957746 0.91666667 0.875 0.88888889 0.92857143 0.86666667 0.91428571] mean value: 0.8984521694732962 key: train_jcc value: [0.95638126 0.95322581 0.94830372 0.95322581 0.94711538 0.95307443 0.94070513 0.95024077 0.95476575 0.9470305 ] mean value: 0.950406855441748 MCC on Blind test: 0.79 Accuracy on Blind test: 0.92 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.03859448 0.04442239 0.0445168 0.04431796 0.03685474 0.04462409 0.04459906 0.04071379 0.04456615 0.04441738] mean value: 0.04276268482208252 key: score_time value: [0.02326918 0.02230549 0.0216465 0.02467608 0.02369523 0.02307796 0.02220964 0.02150774 0.02098942 0.02424097] mean value: 0.022761821746826172 key: test_mcc value: [0.5339213 0.62602155 0.37987955 0.57602211 0.58003439 0.5768179 0.66666667 0.59152048 0.66789441 0.63753558] mean value: 0.5836313947320002 key: train_mcc value: [0.62153995 0.63685301 0.61201456 0.59876685 0.59876685 0.61017022 0.63706477 0.62017157 0.61512692 0.57557164] mean value: 0.6126046335536118 key: test_accuracy value: [0.76691729 0.81203008 0.68939394 0.78787879 0.78787879 0.78787879 0.83333333 0.79545455 0.83333333 0.81818182] mean value: 0.7912280701754386 key: train_accuracy value: [0.81076535 0.81833474 0.80588235 0.79915966 0.79915966 0.80504202 0.81848739 0.81008403 0.80756303 0.78739496] mean value: 0.8061873193347987 key: test_fscore value: [0.76691729 0.80620155 0.67716535 0.78461538 0.8 0.78125 0.83333333 0.8 0.83823529 0.8125 ] mean value: 0.7900218210017753 key: train_fscore value: [0.8104465 0.8202995 0.80306905 0.79520137 0.79520137 0.80338983 0.82 0.80976431 0.80740118 0.78170837] mean value: 0.8046481487421849 key: test_precision value: [0.76119403 0.83870968 0.70491803 0.796875 0.75675676 0.80645161 0.83333333 0.7826087 0.81428571 0.83870968] mean value: 0.7933842530407546 key: train_precision value: [0.8125 0.81085526 0.81487889 0.81118881 0.81118881 0.81025641 0.81322314 0.81112985 0.80808081 0.80319149] mean value: 
0.8106493474693212 key: test_recall value: [0.77272727 0.7761194 0.65151515 0.77272727 0.84848485 0.75757576 0.83333333 0.81818182 0.86363636 0.78787879] mean value: 0.7882180009045681 key: train_recall value: [0.80840336 0.82996633 0.79159664 0.77983193 0.77983193 0.79663866 0.82689076 0.80840336 0.80672269 0.76134454] mean value: 0.7989630195512548 key: test_roc_auc value: [0.76696065 0.81230213 0.68939394 0.78787879 0.78787879 0.78787879 0.83333333 0.79545455 0.83333333 0.81818182] mean value: 0.7912596110357304 key: train_roc_auc value: [0.81076734 0.81834451 0.80588235 0.79915966 0.79915966 0.80504202 0.81848739 0.81008403 0.80756303 0.78739496] mean value: 0.8061884956002603 key: test_jcc value: [0.62195122 0.67532468 0.51190476 0.64556962 0.66666667 0.64102564 0.71428571 0.66666667 0.72151899 0.68421053] mean value: 0.6549124479297047 key: train_jcc value: [0.68130312 0.69534556 0.67094017 0.66002845 0.66002845 0.6713881 0.69491525 0.68033946 0.67700987 0.64164306] mean value: 0.673294149450316 MCC on Blind test: 0.52 Accuracy on Blind test: 0.81 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', 
MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 
'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC0... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [9.06408787 7.39492488 7.35578918 2.32716751 8.98293686 8.69696665 8.82733345 8.63794541 8.75473666 8.13571477] mean value: 7.817760324478149 key: score_time value: [0.02943587 0.03138995 0.01382542 0.01349759 0.02445674 0.03002214 0.02363658 0.02672243 0.02811027 0.02797961] mean value: 0.024907660484313966 key: test_mcc value: [0.89732778 0.93984622 0.98496155 0.95553309 0.91287093 0.91287093 0.91287093 0.94112395 0.86853519 0.89486432] mean value: 0.9220804884879608 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.94736842 0.96992481 0.99242424 0.97727273 0.95454545 0.95454545 0.95454545 0.96969697 0.93181818 0.9469697 ] mean value: 0.9599111414900888 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94890511 0.97014925 0.9924812 0.97777778 0.95652174 0.95652174 0.95652174 0.97058824 0.9352518 0.94814815] mean value: 0.9612866743400412 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.91549296 0.97014925 0.98507463 0.95652174 0.91666667 0.91666667 0.91666667 0.94285714 0.89041096 0.92753623] mean value: 0.9338042911119239 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.98484848 0.97014925 1. 1. 1. 1. 1. 1. 0.98484848 0.96969697] mean value: 0.9909543193125283 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.94764812 0.96992311 0.99242424 0.97727273 0.95454545 0.95454545 0.95454545 0.96969697 0.93181818 0.9469697 ] mean value: 0.9599389416553595 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.90277778 0.94202899 0.98507463 0.95652174 0.91666667 0.91666667 0.91666667 0.94285714 0.87837838 0.90140845] mean value: 0.9259047101220877 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.76 Accuracy on Blind test: 0.91 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', 
MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.07274532 0.08438516 0.08188748 0.10827613 0.10006595 0.09745407 0.07904243 0.09105849 0.09434795 0.0698843 ] mean value: 0.08791472911834716 key: score_time value: [0.02005672 0.01286626 0.0128572 0.02031779 0.02774405 0.01311469 0.01296258 0.02588487 0.01283789 0.01285267] mean value: 0.017149472236633302 key: test_mcc value: [0.65489945 0.68499676 0.71220297 0.73576721 0.78086881 0.68378319 0.72760688 0.67161876 0.74456392 0.65765844] mean value: 0.7053966384637252 key: train_mcc value: [0.82394775 0.80531684 0.80123327 0.79450021 0.79161831 0.79274396 0.78937774 0.78455181 0.79802051 0.79422122] mean value: 0.7975531631116414 key: test_accuracy value: 
[0.82706767 0.84210526 0.85606061 0.86363636 0.87878788 0.84090909 0.86363636 0.83333333 0.87121212 0.82575758] mean value: 0.850250626566416 key: train_accuracy value: [0.91084945 0.90159798 0.9 0.89663866 0.89495798 0.89579832 0.89411765 0.89159664 0.89831933 0.89663866] mean value: 0.8980514661709933 key: test_fscore value: [0.82962963 0.83969466 0.85714286 0.87323944 0.89189189 0.84671533 0.86567164 0.84285714 0.87591241 0.83687943] mean value: 0.8559634426271225 key: train_fscore value: [0.91410049 0.90495532 0.90269828 0.89942764 0.89829129 0.898527 0.89689034 0.89469388 0.90122449 0.89909762] mean value: 0.9009906357661513 key: test_precision value: [0.8115942 0.859375 0.85074627 0.81578947 0.80487805 0.81690141 0.85294118 0.7972973 0.84507042 0.78666667] mean value: 0.8241259965440433 key: train_precision value: [0.88262911 0.8744113 0.87898089 0.87579618 0.87066246 0.87559809 0.87400319 0.86984127 0.87619048 0.87820513] mean value: 0.875631809174941 key: test_recall value: [0.84848485 0.82089552 0.86363636 0.93939394 1. 
0.87878788 0.87878788 0.89393939 0.90909091 0.89393939] mean value: 0.8926956128448665 key: train_recall value: [0.94789916 0.93771044 0.92773109 0.92436975 0.92773109 0.92268908 0.9210084 0.9210084 0.92773109 0.9210084 ] mean value: 0.9278886908298674 key: test_roc_auc value: [0.8272275 0.84226594 0.85606061 0.86363636 0.87878788 0.84090909 0.86363636 0.83333333 0.87121212 0.82575758] mean value: 0.8502826775214836 key: train_roc_auc value: [0.91081827 0.90162833 0.9 0.89663866 0.89495798 0.89579832 0.89411765 0.89159664 0.89831933 0.89663866] mean value: 0.8980513821690292 key: test_jcc value: [0.70886076 0.72368421 0.75 0.775 0.80487805 0.73417722 0.76315789 0.72839506 0.77922078 0.7195122 ] mean value: 0.7486886164798315 key: train_jcc value: [0.84179104 0.8264095 0.82265276 0.81723626 0.81536189 0.81575037 0.81305638 0.80945347 0.82020802 0.81669151] mean value: 0.8198611195150052 MCC on Blind test: 0.63 Accuracy on Blind test: 0.85 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', 
learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01525617 0.01732731 0.01737404 0.01752138 0.01738 0.01749825 0.01730013 0.01765752 0.01737523 0.01737618] mean value: 0.017206621170043946 key: score_time value: [0.01277661 0.01270509 0.01268816 0.01266551 0.01260066 0.01256776 0.01265335 0.01260543 0.01266003 0.01263642] mean value: 0.012655901908874511 key: test_mcc value: [0.56389958 0.58604879 0.36430604 0.51610023 0.5611861 0.563786 0.63665602 0.5153882 0.63900965 0.68378319] mean value: 0.5630163814123827 key: train_mcc value: [0.57686697 0.59485056 0.561523 0.55675563 0.56322889 0.58015978 0.55998031 0.58695598 0.57201782 0.55821108] mean value: 0.5710550001904885 key: test_accuracy value: [0.78195489 0.78947368 0.68181818 0.75757576 0.78030303 0.78030303 0.81818182 0.75757576 0.81818182 0.84090909] mean value: 0.7806277056277057 key: train_accuracy value: [0.78805719 0.79730866 0.78067227 0.77815126 0.78151261 0.78991597 0.77983193 0.79327731 0.78571429 0.7789916 ] mean value: 0.7853433080549292 key: test_fscore value: [0.77862595 0.77419355 0.671875 0.75 0.78518519 0.768 0.81538462 0.75384615 0.82608696 0.83464567] mean value: 0.7757843082814602 key: train_fscore value: [0.78275862 0.794193 0.77787234 0.77358491 0.77853492 0.78632479 0.77606838 0.78938356 0.78073947 0.77578858] mean value: 0.7815248554785705 key: test_precision value: [0.78461538 0.84210526 0.69354839 0.77419355 0.76811594 0.81355932 0.828125 0.765625 0.79166667 0.86885246] mean value: 0.7930406973003095 key: train_precision value: [0.80353982 0.80589255 0.78793103 0.78984238 0.78929188 0.8 0.78956522 0.80453752 0.79929577 0.78719723] mean 
value: 0.7957093415182501 key: test_recall value: [0.77272727 0.71641791 0.65151515 0.72727273 0.8030303 0.72727273 0.8030303 0.74242424 0.86363636 0.8030303 ] mean value: 0.7610357304387155 key: train_recall value: [0.76302521 0.78282828 0.76806723 0.75798319 0.76806723 0.77310924 0.76302521 0.77478992 0.76302521 0.76470588] mean value: 0.7678626602156013 key: test_roc_auc value: [0.78188602 0.79002714 0.68181818 0.75757576 0.78030303 0.78030303 0.81818182 0.75757576 0.81818182 0.84090909] mean value: 0.7806761646313886 key: train_roc_auc value: [0.78807826 0.79729649 0.78067227 0.77815126 0.78151261 0.78991597 0.77983193 0.79327731 0.78571429 0.7789916 ] mean value: 0.7853441982853748 key: test_jcc value: [0.6375 0.63157895 0.50588235 0.6 0.64634146 0.62337662 0.68831169 0.60493827 0.7037037 0.71621622] mean value: 0.6357849266937401 key: train_jcc value: [0.64305949 0.65864023 0.63649025 0.63076923 0.63737796 0.64788732 0.63407821 0.65205092 0.6403385 0.63370474] mean value: 0.6414396857841679 MCC on Blind test: 0.53 Accuracy on Blind test: 0.81 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.04524255 0.03324056 0.04321265 0.0383203 0.03985834 0.04182839 0.03704095 0.03285527 0.06639671 0.03593302] mean value: 0.0413928747177124 key: score_time value: [0.01156163 0.01269126 0.01272488 0.01269293 0.01273298 0.01271439 0.01266551 0.01269531 0.02094913 0.01287222] mean value: 0.013430023193359375 key: test_mcc value: [0.68499676 0.59541983 0.71285802 0.44226898 0.55401326 0.5934603 0.73125738 0.47628967 0.74663552 0.62994079] mean value: 0.6167140502093844 key: train_mcc value: [0.76897638 0.71112018 0.75153227 0.55906289 0.53120474 0.69572774 0.71330134 0.49351182 0.75024237 0.67439964] mean value: 0.6649079378118509 key: test_accuracy value: [0.84210526 0.79699248 0.85606061 0.6969697 0.73484848 0.78787879 0.85606061 0.71212121 0.87121212 0.8030303 ] mean value: 0.7957279562542721 key: train_accuracy value: [0.88393608 0.84693019 0.87563025 0.74537815 0.7210084 0.83193277 0.84789916 0.70336134 0.87478992 0.82689076] mean value: 0.8157757030482504 key: test_fscore value: [0.84444444 0.8057554 0.85271318 0.60784314 0.79041916 0.81081081 0.8707483 0.62745098 0.864 0.77192982] mean value: 0.7846115232438119 key: train_fscore value: [0.88707038 0.86191199 0.87728027 0.66519337 0.78157895 0.85380117 0.86298259 0.58616647 0.872103 0.80268199] mean value: 0.805077017361187 key: test_precision value: [0.82608696 0.77777778 0.87301587 0.86111111 0.65346535 0.73170732 0.79012346 0.88888889 0.91525424 0.91666667] mean value: 0.823409763166814 key: train_precision value: [0.86443381 0.78453039 0.86579378 0.97096774 0.64216216 
0.75549806 0.78512397 0.96899225 0.89122807 0.93318486] mean value: 0.8461915083249473 key: test_recall value: [0.86363636 0.8358209 0.83333333 0.46969697 1. 0.90909091 0.96969697 0.48484848 0.81818182 0.66666667] mean value: 0.7850972410673903 key: train_recall value: [0.91092437 0.95622896 0.88907563 0.50588235 0.99831933 0.98151261 0.95798319 0.42016807 0.85378151 0.70420168] mean value: 0.8178077695724755 key: test_roc_auc value: [0.84226594 0.79669833 0.85606061 0.6969697 0.73484848 0.78787879 0.85606061 0.71212121 0.87121212 0.8030303 ] mean value: 0.7957146087743103 key: train_roc_auc value: [0.88391336 0.84702204 0.87563025 0.74537815 0.7210084 0.83193277 0.84789916 0.70336134 0.87478992 0.82689076] mean value: 0.8157826160767337 key: test_jcc value: [0.73076923 0.6746988 0.74324324 0.43661972 0.65346535 0.68181818 0.77108434 0.45714286 0.76056338 0.62857143] mean value: 0.6537976519201265 key: train_jcc value: [0.79705882 0.75733333 0.78138848 0.49834437 0.64146868 0.74489796 0.75898802 0.4145937 0.77321157 0.6704 ] mean value: 0.6837684929881324 MCC on Blind test: 0.41 Accuracy on Blind test: 0.79 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.04435778 0.07317758 0.05246305 0.03744841 0.05032897 0.07695699 0.04261231 0.04731846 0.0365212 0.0385685 ] mean value: 0.04997532367706299 key: score_time value: [0.01271534 0.01193547 0.01296759 0.012784 0.02681756 0.01191354 0.01280212 0.01282024 0.02049041 0.01277399] mean value: 0.014802026748657226 key: test_mcc value: [0.51226807 0.62415197 0.69544219 0.73029674 0.47140452 0.63362511 0.62407827 0.55301004 0.45742763 0.5934603 ] mean value: 0.5895164834865797 key: train_mcc value: [0.61208229 0.80123118 0.71239844 0.7731977 0.48521622 0.72253313 0.63418446 0.60364035 0.53398461 0.63871402] mean value: 0.6517182402152969 key: test_accuracy value: [0.72932331 0.81203008 0.84090909 0.86363636 0.68181818 0.81060606 0.78030303 0.75757576 0.68939394 0.78787879] mean value: 0.7753474595579859 key: train_accuracy value: [0.78301093 0.8999159 0.84453782 0.88655462 0.69159664 0.85294118 0.78823529 0.7789916 0.73109244 0.8 ] mean value: 0.8056876409100225 key: test_fscore value: [0.64705882 0.81203008 0.85517241 0.86956522 0.75862069 0.78991597 0.81987578 0.7037037 0.56842105 0.75862069] mean value: 0.7582984408331488 key: train_fscore value: [0.73236515 0.90269828 0.86204325 0.88569009 0.76398714 0.83537159 0.82475661 0.72689512 0.64125561 0.75862069] mean value: 0.793368352154183 key: test_precision value: [0.91666667 0.81818182 0.78481013 0.83333333 0.61111111 0.88679245 0.69473684 0.9047619 0.93103448 0.88 ] mean value: 0.8261428738331185 key: train_precision value: [0.95663957 0.87758347 0.77479893 0.89249147 0.61875 0.94871795 0.70344009 
0.95108696 0.96296296 0.95652174] mean value: 0.8642993129637412 key: test_recall value: [0.5 0.80597015 0.93939394 0.90909091 1. 0.71212121 1. 0.57575758 0.40909091 0.66666667] mean value: 0.7518091361374943 key: train_recall value: [0.59327731 0.92929293 0.97142857 0.8789916 0.99831933 0.74621849 0.99663866 0.58823529 0.48067227 0.62857143] mean value: 0.78116458704694 key: test_roc_auc value: [0.72761194 0.81207598 0.84090909 0.86363636 0.68181818 0.81060606 0.78030303 0.75757576 0.68939394 0.78787879] mean value: 0.7751809136137494 key: train_roc_auc value: [0.78317064 0.89994058 0.84453782 0.88655462 0.69159664 0.85294118 0.78823529 0.7789916 0.73109244 0.8 ] mean value: 0.8057060804119628 key: test_jcc value: [0.47826087 0.6835443 0.74698795 0.76923077 0.61111111 0.65277778 0.69473684 0.54285714 0.39705882 0.61111111] mean value: 0.6187676702892502 key: train_jcc value: [0.57774141 0.82265276 0.75753604 0.79483283 0.61810614 0.71728595 0.70177515 0.57096248 0.47194719 0.61111111] mean value: 0.6643951051173903 MCC on Blind test: 0.6 Accuracy on Blind test: 0.85 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.37129021 0.35578799 0.35703659 0.34674764 0.33202577 0.35261488 0.33393955 0.33623791 0.33627582 0.33462644] mean value: 0.345658278465271 key: score_time value: [0.01794934 0.01799059 0.01849675 0.01664996 0.01663804 0.01741576 0.01670289 0.0176537 0.017905 0.01648688] mean value: 0.01738889217376709 key: test_mcc value: [0.86718264 0.88011764 0.86373551 0.89651574 0.82773811 0.8196886 0.81855773 0.833429 0.85839508 0.833429 ] mean value: 0.8498789066655591 key: train_mcc value: [0.91988743 0.9180489 0.9179388 0.9179388 0.92357126 0.90120825 0.93138846 0.91138916 0.92810999 0.928425 ] mean value: 0.9197906059991515 key: test_accuracy value: [0.93233083 0.93984962 0.93181818 0.9469697 0.90909091 0.90909091 0.90909091 0.91666667 0.92424242 0.91666667] mean value: 0.9235816814764183 key: train_accuracy value: [0.95962994 0.9587889 0.95882353 0.95882353 0.96134454 0.95042017 0.96554622 0.95546218 0.96386555 0.96386555] mean value: 0.9596570099865009 key: test_fscore value: [0.93430657 0.93939394 0.93233083 0.94890511 0.91549296 0.91176471 0.91044776 0.91729323 0.92957746 0.91729323] mean value: 0.9256805801070733 key: train_fscore value: [0.96039604 0.95940348 0.9593361 0.9593361 0.96217105 0.95111848 0.9659751 0.95616212 0.96437448 0.9645507 ] mean value: 0.9602823650782725 key: test_precision value: [0.90140845 0.95384615 0.92537313 0.91549296 0.85526316 0.88571429 0.89705882 0.91044776 0.86842105 0.91044776] mean value: 0.9023473538783289 key: train_precision value: [0.94327391 0.94453507 0.94754098 0.94754098 0.94202899 0.9379085 0.95409836 
0.94136808 0.95098039 0.94660194] mean value: 0.9455877201594677 key: test_recall value: [0.96969697 0.92537313 0.93939394 0.98484848 0.98484848 0.93939394 0.92424242 0.92424242 1. 0.92424242] mean value: 0.9516282225237449 key: train_recall value: [0.97815126 0.97474747 0.97142857 0.97142857 0.98319328 0.96470588 0.97815126 0.97142857 0.97815126 0.98319328] mean value: 0.9754579407520584 key: test_roc_auc value: [0.93260968 0.93995929 0.93181818 0.9469697 0.90909091 0.90909091 0.90909091 0.91666667 0.92424242 0.91666667] mean value: 0.9236205336951606 key: train_roc_auc value: [0.95961435 0.95880231 0.95882353 0.95882353 0.96134454 0.95042017 0.96554622 0.95546218 0.96386555 0.96386555] mean value: 0.9596567920097332 key: test_jcc value: [0.87671233 0.88571429 0.87323944 0.90277778 0.84415584 0.83783784 0.83561644 0.84722222 0.86842105 0.84722222] mean value: 0.8618919446304775 key: train_jcc value: [0.92380952 0.92197452 0.92185008 0.92185008 0.92709984 0.90679305 0.93418941 0.91600634 0.9312 0.93152866] mean value: 0.9236301503750806 MCC on Blind test: 0.8 Accuracy on Blind test: 0.92 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', 
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. 
warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.12730455 0.2520771 0.24463916 0.23580003 0.25089359 0.23320365 0.24741626 0.24575734 0.2396307 0.13593245] mean value: 0.22126548290252684 key: score_time value: [0.03831649 0.03454685 0.03999615 0.03003001 0.03436136 0.04432249 0.03896332 0.03376579 0.03437304 0.03488183] mean value: 0.03635573387145996 key: test_mcc value: [0.85953823 0.89484396 0.91076511 0.92690611 0.85478752 0.85478752 0.94112395 0.92690611 0.87177979 0.89651574] mean value: 0.8937954025367763 key: train_mcc value: [0.99495514 0.99495514 0.99495939 0.99495939 0.99663866 0.99327731 0.99160924 0.99663866 0.99664429 0.99497063] mean value: 0.9949607836874935 key: test_accuracy value: [0.92481203 0.94736842 0.95454545 0.96212121 0.92424242 0.92424242 0.96969697 0.96212121 0.93181818 0.9469697 ] mean value: 0.9447938026885395 key: train_accuracy value: [0.99747687 0.99747687 0.99747899 0.99747899 0.99831933 0.99663866 0.99579832 0.99831933 0.99831933 0.99747899] mean value: 0.9974785675413984 key: test_fscore value: [0.92957746 0.94736842 0.95588235 0.96350365 0.92857143 0.92857143 0.97058824 0.96350365 0.93617021 0.94890511] mean value: 0.9472641952744596 key: train_fscore value: [0.99748111 0.99747262 0.99748111 0.99748111 0.99831933 0.99663866 0.99580889 0.99831933 0.99832215 0.99748533] mean value: 0.9974809619824477 key: test_precision value: [0.86842105 0.95454545 0.92857143 0.92957746 0.87837838 0.87837838 0.94285714 0.92957746 0.88 0.91549296] mean value: 0.9105799722686305 key: train_precision value: [0.9966443 0.99831366 0.9966443 0.9966443 
0.99831933 0.99663866 0.99331104 0.99831933 0.99664992 0.99498328] mean value: 0.9966468086818777 key: test_recall value: [1. 0.94029851 0.98484848 1. 0.98484848 0.98484848 1. 1. 1. 0.98484848] mean value: 0.9879692446856626 key: train_recall value: [0.99831933 0.996633 0.99831933 0.99831933 0.99831933 0.99663866 0.99831933 0.99831933 1. 1. ] mean value: 0.9983187618481736 key: test_roc_auc value: [0.92537313 0.94742198 0.95454545 0.96212121 0.92424242 0.92424242 0.96969697 0.96212121 0.93181818 0.9469697 ] mean value: 0.9448552691090004 key: train_roc_auc value: [0.99747616 0.99747616 0.99747899 0.99747899 0.99831933 0.99663866 0.99579832 0.99831933 0.99831933 0.99747899] mean value: 0.9974784257137198 key: test_jcc value: [0.86842105 0.9 0.91549296 0.92957746 0.86666667 0.86666667 0.94285714 0.92957746 0.88 0.90277778] mean value: 0.9002037193923776 key: train_jcc value: [0.99497487 0.99495798 0.99497487 0.99497487 0.9966443 0.99329983 0.99165275 0.9966443 0.99664992 0.99498328] mean value: 0.9949756977839559 MCC on Blind test: 0.77 Accuracy on Blind test: 0.91 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.76625323 0.93301606 0.85965228 0.68952823 0.66378808 0.87883234 0.7050457 0.75222564 0.88421106 0.79904914] mean value: 0.7931601762771606 key: score_time value: [0.05368519 0.07283998 0.06703472 0.05444479 0.05356479 0.05195475 0.0593667 0.05376959 0.05021644 0.02867389] mean value: 0.054555082321166994 key: test_mcc value: [0.82301362 0.79255617 0.75792383 0.78368849 0.80758535 0.7800135 0.83806027 0.81442137 0.80123362 0.82425939] mean value: 0.8022755606617251 key: train_mcc value: [0.95680613 0.96343671 0.95828776 0.958472 0.95359324 0.96658315 0.96157408 0.96183506 0.96346619 0.96682907] mean value: 0.9610883387492701 key: test_accuracy value: [0.90977444 0.89473684 0.87878788 0.88636364 0.90151515 0.88636364 0.91666667 0.90151515 0.89393939 0.90909091] mean value: 0.8978753702437913 key: train_accuracy value: [0.97813288 0.98149706 0.9789916 0.9789916 0.97647059 0.98319328 0.98067227 0.98067227 0.98151261 0.98319328] mean value: 0.9803327420118594 key: test_fscore value: [0.91304348 0.9 0.88059701 0.8951049 0.90647482 0.89361702 0.92086331 0.90909091 0.90277778 0.91428571] mean value: 0.9035854940218537 key: train_fscore value: [0.9785124 0.98175788 0.97925311 0.97932175 0.97689769 0.98336106 0.98088113 0.98097601 0.98178808 0.98344371] mean value: 0.9806192826004415 key: test_precision value: [0.875 0.8630137 0.86764706 0.83116883 0.8630137 0.84 0.87671233 0.84415584 0.83333333 0.86486486] mean value: 0.85589096583738 key: train_precision value: [0.96260163 0.96732026 0.96721311 0.96416938 0.95948136 0.97364086 0.97039474 0.96579805 
0.96737357 0.96900489] mean value: 0.9666997850416796 key: test_recall value: [0.95454545 0.94029851 0.89393939 0.96969697 0.95454545 0.95454545 0.96969697 0.98484848 0.98484848 0.96969697] mean value: 0.9576662143826323 key: train_recall value: [0.99495798 0.996633 0.99159664 0.99495798 0.99495798 0.99327731 0.99159664 0.99663866 0.99663866 0.99831933] mean value: 0.9949574173103585 key: test_roc_auc value: [0.91010855 0.89439168 0.87878788 0.88636364 0.90151515 0.88636364 0.91666667 0.90151515 0.89393939 0.90909091] mean value: 0.8978742650384441 key: train_roc_auc value: [0.97811872 0.98150978 0.9789916 0.9789916 0.97647059 0.98319328 0.98067227 0.98067227 0.98151261 0.98319328] mean value: 0.9803325976855389 key: test_jcc value: [0.84 0.81818182 0.78666667 0.81012658 0.82894737 0.80769231 0.85333333 0.83333333 0.82278481 0.84210526] mean value: 0.824317148319147 key: train_jcc value: [0.9579288 0.96416938 0.95934959 0.95948136 0.95483871 0.96726678 0.96247961 0.96266234 0.96422764 0.96742671] mean value: 0.9619830922592865 MCC on Blind test: 0.59 Accuracy on Blind test: 0.84 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, 
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [1.54636669 1.56205773 1.50813985 1.55183625 1.58534145 1.55194783 1.54392004 1.56743145 1.5406394 1.5221889 ] mean value: 1.5479869604110719 key: score_time value: [0.01014566 0.00983214 0.00978208 0.01077819 0.01102424 0.00981069 0.01092172 0.01045537 0.01038098 0.00968528] mean value: 0.010281634330749512 key: test_mcc value: [0.869585 0.89484396 0.92690611 0.91287093 0.88531564 0.89901011 0.91287093 0.8824419 0.88531564 0.91076511] mean value: 0.8979925323640788 key: train_mcc value: [0.97672119 0.9683493 0.97666924 0.97009871 0.97009871 0.97001645 0.97495654 0.97166052 0.97666924 0.97173741] mean value: 0.9726977310015105 key: test_accuracy value: [0.93233083 0.94736842 0.96212121 0.95454545 0.93939394 0.9469697 0.95454545 0.93939394 0.93939394 0.95454545] mean value: 0.9470608339029392 key: train_accuracy value: [0.9882254 0.98402019 0.98823529 0.98487395 0.98487395 0.98487395 0.98739496 0.98571429 0.98823529 0.98571429] mean value: 0.9862161550911366 key: test_fscore value: [0.9352518 0.94736842 0.96350365 0.95652174 0.94285714 0.94964029 0.95652174 0.94202899 0.94285714 0.95588235] mean value: 0.9492433259442181 key: train_fscore value: [0.98837209 0.98420615 0.98835275 0.98507463 0.98507463 0.98504983 0.98751041 0.98586866 0.98835275 0.98589212] mean value: 0.986375400863372 key: test_precision value: [0.89041096 0.95454545 0.92957746 0.91666667 0.89189189 0.90410959 0.91666667 0.90277778 0.89189189 0.92857143] mean value: 0.9127109790745715 key: train_precision value: [0.97701149 0.97208539 0.9785832 0.97217676 0.97217676 0.97372742 
0.97854785 0.97532895 0.9785832 0.97377049] mean value: 0.9751991507005686 key: test_recall value: [0.98484848 0.94029851 1. 1. 1. 1. 1. 0.98484848 1. 0.98484848] mean value: 0.9894843962008141 key: train_recall value: [1. 0.996633 0.99831933 0.99831933 0.99831933 0.99663866 0.99663866 0.99663866 0.99831933 0.99831933] mean value: 0.9978145601675014 key: test_roc_auc value: [0.93272275 0.94742198 0.96212121 0.95454545 0.93939394 0.9469697 0.95454545 0.93939394 0.93939394 0.95454545] mean value: 0.9471053821800091 key: train_roc_auc value: [0.98821549 0.98403078 0.98823529 0.98487395 0.98487395 0.98487395 0.98739496 0.98571429 0.98823529 0.98571429] mean value: 0.9862162238632827 key: test_jcc value: [0.87837838 0.9 0.92957746 0.91666667 0.89189189 0.90410959 0.91666667 0.89041096 0.89189189 0.91549296] mean value: 0.9035086465975912 key: train_jcc value: [0.97701149 0.96890344 0.97697368 0.97058824 0.97058824 0.9705401 0.97532895 0.97213115 0.97697368 0.97217676] mean value: 0.9731215722770584 MCC on Blind test: 0.78 Accuracy on Blind test: 0.91 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are 
collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.07663488 0.0694983 0.05319428 0.04964781 0.07832623 0.0508635 0.04934716 0.07398272 0.073277 0.05942822] mean value: 0.0634200096130371 key: score_time value: [0.02299166 0.01402068 0.01389337 0.01395512 0.01391602 0.01397276 0.02293348 0.02380395 0.01953149 0.01425648] mean value: 0.017327499389648438 key: test_mcc value: [0.23393668 0.21899752 0.34444748 0.25400025 0.1717795 0.25400025 0.23664319 0.23664319 0.2466911 0.19682713] mean value: 0.23939663017861 key: train_mcc value: [0.25597326 0.26474332 0.29790362 0.2611946 0.26484691 0.25750387 0.26665917 0.26665917 0.2611946 0.25750387] mean value: 0.26541823911621043 key: test_accuracy value: [0.54887218 0.54887218 0.60606061 0.56060606 0.54545455 0.56060606 0.5530303 0.5530303 0.56818182 0.56060606] mean value: 0.5605320118478013 key: train_accuracy value: [0.56181665 0.56518082 0.58151261 0.56386555 0.56554622 0.56218487 0.56638655 0.56638655 0.56386555 0.56218487] mean value: 0.5658930249980564 key: test_fscore value: [0.6875 0.69072165 0.7173913 0.69473684 0.68085106 0.69473684 0.69109948 0.69109948 0.69518717 0.68478261] mean value: 0.692810642922331 key: train_fscore value: [0.69549971 0.69677419 0.7049763 0.69631363 0.69712947 0.69549971 0.6975381 0.6975381 0.69631363 0.69549971] mean value: 0.6973082556135722 key: test_precision value: [0.52380952 0.52755906 0.55932203 0.53225806 0.52459016 0.53225806 0.528 0.528 0.53719008 0.53389831] mean value: 0.5326885293521997 key: train_precision value: [0.53315412 0.53465347 0.54437328 0.53411131 0.53507194 0.53315412 0.53555356 0.53555356 0.53411131 
0.53315412] mean value: 0.5352890789817935 key: test_recall value: [1. 1. 1. 1. 0.96969697 1. 1. 1. 0.98484848 0.95454545] mean value: 0.990909090909091 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.55223881 0.54545455 0.60606061 0.56060606 0.54545455 0.56060606 0.5530303 0.5530303 0.56818182 0.56060606] mean value: 0.5605269109000452 key: train_roc_auc value: [0.56144781 0.56554622 0.58151261 0.56386555 0.56554622 0.56218487 0.56638655 0.56638655 0.56386555 0.56218487] mean value: 0.565892680304445 key: test_jcc value: [0.52380952 0.52755906 0.55932203 0.53225806 0.51612903 0.53225806 0.528 0.528 0.53278689 0.52066116] mean value: 0.5300783816386957 key: train_jcc value: [0.53315412 0.53465347 0.54437328 0.53411131 0.53507194 0.53315412 0.53555356 0.53555356 0.53411131 0.53315412] mean value: 0.5352890789817935 MCC on Blind test: 0.12 Accuracy on Blind test: 0.39 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', 
learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.06939888 0.05537415 0.05750942 0.05315638 0.05297065 0.04990363 0.04437757 0.03693891 0.05415535 0.05993485] mean value: 0.053371977806091306 key: score_time value: [0.02014852 0.02306843 0.0202558 0.02007866 0.02018118 0.02053142 0.02035832 0.022403 0.02546763 0.02311039] mean value: 0.02156033515930176 key: test_mcc value: [0.68430574 0.64039914 0.74250948 0.78368849 0.75725927 0.71285802 0.75897093 0.67161876 0.76072577 0.74663552] mean value: 0.7258971118039523 key: train_mcc value: [0.77315466 0.78527719 0.77672743 0.78455181 0.7648432 0.78328462 0.77766758 0.7771158 0.7797431 0.77335768] mean value: 0.7775723077305859 key: test_accuracy value: [0.84210526 0.81954887 0.87121212 0.88636364 0.87121212 0.85606061 0.87878788 0.83333333 0.87878788 0.87121212] mean value: 0.8608623832308043 key: train_accuracy value: [0.88561817 0.89150547 0.88739496 0.89159664 0.88151261 0.8907563 0.88823529 0.88739496 0.88907563 0.88571429] mean value: 0.8878804305574206 key: test_fscore value: [0.84210526 0.81538462 0.87022901 0.8951049 0.88275862 0.85925926 0.88235294 0.84285714 0.88405797 0.87769784] mean value: 0.8651807558004633 key: train_fscore value: [0.88961039 0.89537713 0.89123377 0.89469388 0.88545898 0.89430894 0.89125102 0.89158576 0.89250814 0.88961039] mean value: 0.891563839740782 key: test_precision value: [0.8358209 0.84126984 0.87692308 0.83116883 0.81012658 0.84057971 0.85714286 0.7972973 0.84722222 0.83561644] mean value: 0.8373167752326087 key: train_precision value: [0.86028257 0.86384977 0.86185243 0.86984127 0.85691824 0.86614173 0.86783439 
0.85959438 0.8657188 0.86028257] mean value: 0.8632316166842141 key: test_recall value: [0.84848485 0.79104478 0.86363636 0.96969697 0.96969697 0.87878788 0.90909091 0.89393939 0.92424242 0.92424242] mean value: 0.8972862957937585 key: train_recall value: [0.9210084 0.92929293 0.92268908 0.9210084 0.91596639 0.92436975 0.91596639 0.92605042 0.9210084 0.9210084 ] mean value: 0.921836855954503 key: test_roc_auc value: [0.84215287 0.81976481 0.87121212 0.88636364 0.87121212 0.85606061 0.87878788 0.83333333 0.87878788 0.87121212] mean value: 0.8608887381275441 key: train_roc_auc value: [0.88558838 0.89153722 0.88739496 0.89159664 0.88151261 0.8907563 0.88823529 0.88739496 0.88907563 0.88571429] mean value: 0.887880626998274 key: test_jcc value: [0.72727273 0.68831169 0.77027027 0.81012658 0.79012346 0.75324675 0.78947368 0.72839506 0.79220779 0.78205128] mean value: 0.7631479298368039 key: train_jcc value: [0.80116959 0.81057269 0.80380673 0.80945347 0.79446064 0.80882353 0.80383481 0.80437956 0.80588235 0.80116959] mean value: 0.8043552968756095 MCC on Blind test: 0.65 Accuracy on Blind test: 0.86 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, 
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=169)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.6347847 0.57954597 0.46806121 0.50969672 0.58284903 0.64407444 0.56478858 0.61213517 0.66929221 0.59609604] mean value: 0.5861324071884155 key: score_time value: [0.02030373 0.02027869 0.02028942 0.02028108 0.02024817 0.02016139 0.02032113 0.02037835 0.02039146 0.02017832] mean value: 0.02028317451477051 key: test_mcc value: [0.68430574 0.64039914 0.72760688 0.78368849 0.75725927 0.71285802 0.71285802 0.67161876 0.76072577 0.74663552] mean value: 0.7197955608042741 key: train_mcc value: [0.77315466 0.78527719 0.78968204 0.78455181 0.7648432 0.78328462 0.78762435 0.7771158 0.7797431 0.77335768] mean value: 0.7798634449591065 key: test_accuracy value: [0.84210526 0.81954887 0.86363636 0.88636364 0.87121212 0.85606061 0.85606061 0.83333333 0.87878788 0.87121212] mean value: 0.8578320802005013 key: train_accuracy value: [0.88561817 0.89150547 0.89411765 0.89159664 0.88151261 0.8907563 0.89327731 0.88739496 0.88907563 0.88571429] mean value: 0.8890569011456559 key: test_fscore value: [0.84210526 0.81538462 0.86153846 0.8951049 0.88275862 0.85925926 0.85925926 0.84285714 0.88405797 0.87769784] mean value: 0.8620023329992295 key: train_fscore value: [0.88961039 0.89537713 0.89722675 0.89469388 0.88545898 0.89430894 0.8959869 0.89158576 0.89250814 0.88961039] mean value: 0.8926367258754563 key: test_precision value: [0.8358209 0.84126984 0.875 0.83116883 0.81012658 0.84057971 0.84057971 0.7972973 0.84722222 0.83561644] mean value: 0.835468152840508 key: train_precision value: [0.86028257 0.86384977 0.87163233 0.86984127 0.85691824 0.86614173 0.87380192 0.85959438 0.8657188 0.86028257] mean value: 0.8648063585225085 key: test_recall value: [0.84848485 0.79104478 0.84848485 0.96969697 0.96969697 0.87878788 0.87878788 0.89393939 0.92424242 0.92424242] mean value: 0.8927408412483039 key: train_recall value: [0.9210084 0.92929293 0.92436975 0.9210084 0.91596639 0.92436975 0.91932773 0.92605042 0.9210084 0.9210084 ] mean value: 0.9223410576351753 key: test_roc_auc value: [0.84215287 0.81976481 0.86363636 0.88636364 0.87121212 0.85606061 0.85606061 0.83333333 0.87878788 0.87121212] mean value: 0.8578584350972411 key: train_roc_auc value: [0.88558838 0.89153722 0.89411765 0.89159664 0.88151261 0.8907563 0.89327731 0.88739496 0.88907563 0.88571429] mean value: 0.8890570975865093 key: test_jcc value: [0.72727273 0.68831169 0.75675676 0.81012658 0.79012346 0.75324675 0.75324675 0.72839506 0.79220779 0.78205128] mean value: 0.7581738853890753 key: train_jcc value: [0.80116959 0.81057269 0.81360947 0.80945347 0.79446064 0.80882353 0.8115727 0.80437956 0.80588235 0.80116959] mean value: 0.8061093593256186 MCC on Blind test: 0.65 Accuracy on Blind test: 0.86
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:196: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_cd_8020.py:199: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
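Note on reading the metrics: for a binary classifier the Jaccard score (test_jcc) and the F-score (test_fscore) are tied by J = F / (2 - F), so the two columns can be cross-checked against each other. The following is a minimal sketch, not part of the original scripts; the function name `jaccard_from_f1` is hypothetical, and the numbers are the first three RidgeClassifierCV test folds copied from the log.

```python
# Cross-check logged test_jcc against logged test_fscore using
# J = TP / (TP + FP + FN) and F = 2TP / (2TP + FP + FN), hence J = F / (2 - F).

def jaccard_from_f1(f1: float) -> float:
    """Convert a binary F1 score into the equivalent Jaccard score."""
    return f1 / (2.0 - f1)

# First three RidgeClassifierCV test folds, as printed in the log.
logged_fscore = [0.84210526, 0.81538462, 0.86153846]
logged_jcc = [0.72727273, 0.68831169, 0.75675676]

for f1, jcc in zip(logged_fscore, logged_jcc):
    recomputed = jaccard_from_f1(f1)
    # The log prints 8 decimal places, so compare with a loose tolerance.
    status = "OK" if abs(recomputed - jcc) < 1e-6 else "MISMATCH"
    print(f"fscore={f1:.8f} jcc_logged={jcc:.8f} jcc_recomputed={recomputed:.8f} {status}")
```

The same check can be applied to any other model block in this log by swapping in its fscore and jcc arrays.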