/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_orig.py:550: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead. from pandas import MultiIndex, Int64Index 1.22.4 1.4.1 aaindex_df contains non-numerical data Total no. of non-numerial columns: 2 Selecting numerical data only PASS: successfully selected numerical columns only for aaindex_df Now checking for NA in the remaining aaindex_cols Counting aaindex_df cols with NA ncols with NA: 4 columns Dropping these... Original ncols: 127 Revised df ncols: 123 Checking NA in revised df... PASS: cols with NA successfully dropped from aaindex_df Proceeding with combining aa_df with other features_df PASS: ncols match Expected ncols: 123 Got: 123 Total no. of columns in clean aa_df: 123 Proceeding to merge, expected nrows in merged_df: 817 PASS: my_features_df and aa_df successfully combined nrows: 817 ncols: 269 count of NULL values before imputation or_mychisq 244 log10_or_mychisq 244 dtype: int64 count of NULL values AFTER imputation mutationinformation 0 or_rawI 0 logorI 0 dtype: int64 PASS: OR values imputed, data ready for ML Total no. of features for aaindex: 123 No. of numerical features: 168 No. of categorical features: 7 index: 0 ind: 1 Mask count check: True index: 1 ind: 2 Mask count check: True Original Data Counter({1: 309, 0: 158}) Data dim: (467, 175) ------------------------------------------------------------- Successfully split data: ORIGINAL training actual values: training set imputed values: blind test set Train data size: (467, 175) Test data size: (350, 175) y_train numbers: Counter({1: 309, 0: 158}) y_train ratio: 0.511326860841424 y_test_numbers: Counter({0: 315, 1: 35}) y_test ratio: 9.0 ------------------------------------------------------------- Simple Random OverSampling Counter({1: 309, 0: 309}) (618, 175) Simple Random UnderSampling Counter({0: 158, 1: 158}) (316, 175) Simple Combined Over and UnderSampling Counter({0: 309, 1: 309}) (618, 175) SMOTE_NC OverSampling Counter({1: 309, 0: 309}) (618, 175) ##################################################################### Running ML analysis: ORIGINAL Gene name: katG Drug name: isoniazid Output directory: /home/tanu/git/Data/isoniazid/output/ml/tts_orig/ Sanity checks: Total input features: 175 Training data size: (467, 175) Test data size: (350, 175) Target feature numbers (training data): Counter({1: 309, 0: 158}) Target features ratio (training data: 0.511326860841424 Target feature numbers (test data): Counter({0: 315, 1: 35}) Target features ratio (test data): 9.0 ##################################################################### ================================================================ Strucutral features (n): 36 These are: Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist'] FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss'] Other struc columns: ['rsa', 'kd_values', 'rd_values'] ================================================================ AAindex features (n): 123 These are: ['ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'] ================================================================ Evolutionary features (n): 3 These are: ['consurf_score', 'snap2_score', 'provean_score'] ================================================================ Genomic features (n): 6 These are: ['maf', 'logorI'] ['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'] ================================================================ Categorical features (n): 7 These are: ['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'] ================================================================ Pass: No. of features match ##################################################################### Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.04550624 0.03566933 0.03764677 0.03806186 0.03683448 0.03676414 0.03676653 0.03999543 0.03731942 0.04414487] mean value: 0.03887090682983398 key: score_time value: [0.01246476 0.01227474 0.01236653 0.01562428 0.01560974 0.01561832 0.01560354 0.01569724 0.01537657 0.01530743] mean value: 0.014594316482543945 key: test_mcc value: [0.90662544 0.66402366 0.65994312 0.90662544 0.8084425 0.66337469 0.72715272 0.75776742 0.54774009 0.75806977] mean value: 0.7399764867716851 key: train_mcc value: [0.85008968 0.83374086 0.82797794 0.79517432 0.8061574 0.83910661 0.82837741 0.80706626 0.81804827 0.82292436] mean value: 0.822866310204347 key: test_accuracy value: [0.95744681 0.85106383 0.85106383 0.95744681 0.91489362 0.85106383 0.87234043 0.89130435 0.80434783 0.89130435] mean value: 0.88422756706753 key: train_accuracy value: [0.93333333 0.92619048 0.92380952 0.90952381 0.91428571 0.92857143 0.92380952 0.91448931 0.9192399 0.9216152 ] mean value: 0.9214868227576066 key: test_fscore value: [0.96875 0.88888889 0.89230769 0.96875 0.9375 0.89552239 0.9 0.91803279 0.85714286 0.92063492] mean value: 0.9147529533919306 key: train_fscore value: [0.95104895 0.94589878 0.94385965 0.93356643 0.93706294 0.94755245 0.94425087 0.93684211 0.94055944 0.9426087 ] mean value: 0.9423250309268 key: test_precision value: [0.93939394 0.875 0.85294118 0.93939394 0.90909091 0.83333333 0.93103448 0.93333333 0.84375 0.87878788] mean value: 0.8936058992562542 key: train_precision value: [0.92517007 0.91864407 0.92123288 0.90816327 0.91156463 0.92176871 0.91554054 0.91438356 0.91496599 0.91554054] mean value: 0.916697424029508 key: test_recall value: [1. 0.90322581 0.93548387 1. 0.96774194 0.96774194 0.87096774 0.90322581 0.87096774 0.96666667] mean value: 0.9386021505376344 key: train_recall value: [0.97841727 0.97482014 0.9676259 0.96043165 0.96402878 0.97482014 0.97482014 0.96043165 0.9676259 0.97132616] mean value: 0.9694347747608365 key: test_roc_auc value: [0.9375 0.8266129 0.81149194 0.9375 0.89012097 0.79637097 0.87298387 0.88494624 0.7688172 0.85833333] mean value: 0.8584677419354839 key: train_roc_auc value: [0.91174384 0.90290303 0.90282703 0.8851454 0.89046509 0.90642416 0.8993819 0.89280324 0.89640036 0.89763491] mean value: 0.8985728980669149 key: test_jcc value: [0.93939394 0.8 0.80555556 0.93939394 0.88235294 0.81081081 0.81818182 0.84848485 0.75 0.85294118] mean value: 0.8447115029467971 key: train_jcc value: [0.90666667 0.89735099 0.89368771 0.87540984 0.88157895 0.90033223 0.89438944 0.88118812 0.88778878 0.89144737] mean value: 0.8909840082087678 MCC on Blind test: 0.24 Accuracy on Blind test: 0.48 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.99850392 0.90923595 0.88660312 1.0262289 0.91332746 0.87034369 0.93770385 0.96103406 0.87058878 0.85822821] mean value: 0.9231797933578492 key: score_time value: [0.01522923 0.01579213 0.01238084 0.01559043 0.01568437 0.01571655 0.01861119 0.01919341 0.01918507 0.01683927] mean value: 0.016422247886657713 key: test_mcc value: [0.90524194 0.8566725 0.95436677 0.8566725 0.90662544 0.76032282 0.81048387 0.85513419 0.79930604 0.72379255] mean value: 0.8428618620895384 key: train_mcc value: [0.97336948 1. 1. 0.9680267 0.96269263 0.97870346 0.97336948 0.96817602 0.96817602 1. ] mean value: 0.979251380217927 key: test_accuracy value: [0.95744681 0.93617021 0.9787234 0.93617021 0.95744681 0.89361702 0.91489362 0.93478261 0.91304348 0.86956522] mean value: 0.9291859389454209 key: train_accuracy value: [0.98809524 1. 1. 0.98571429 0.98333333 0.99047619 0.98809524 0.98574822 0.98574822 1. ] mean value: 0.9907210722768918 key: test_fscore value: [0.96774194 0.95238095 0.98360656 0.95238095 0.96875 0.92307692 0.93548387 0.95081967 0.9375 0.89655172] mean value: 0.9468292587936569 key: train_fscore value: [0.99102334 1. 1. 0.98924731 0.98747764 0.99283154 0.99102334 0.98924731 0.98924731 1. ] mean value: 0.9930097793978486 key: test_precision value: [0.96774194 0.9375 1. 0.9375 0.93939394 0.88235294 0.93548387 0.96666667 0.90909091 0.92857143] mean value: 0.9404301691351027 key: train_precision value: [0.98924731 1. 1. 0.98571429 0.98220641 0.98928571 0.98924731 0.98571429 0.98571429 1. ] mean value: 0.9907129600778436 key: test_recall value: [0.96774194 0.96774194 0.96774194 0.96774194 1. 0.96774194 0.93548387 0.93548387 0.96774194 0.86666667] mean value: 0.9544086021505377 key: train_recall value: [0.99280576 1. 1. 0.99280576 0.99280576 0.99640288 0.99280576 0.99280576 0.99280576 1. ] mean value: 0.9953237410071942 key: test_roc_auc value: [0.95262097 0.92137097 0.98387097 0.92137097 0.9375 0.85887097 0.90524194 0.9344086 0.88387097 0.87083333] mean value: 0.9169959677419355 key: train_roc_auc value: [0.9858395 1. 1. 0.98231837 0.97879724 0.98763806 0.9858395 0.98241686 0.98241686 1. ] mean value: 0.9885266395373803 key: test_jcc value: [0.9375 0.90909091 0.96774194 0.90909091 0.93939394 0.85714286 0.87878788 0.90625 0.88235294 0.8125 ] mean value: 0.8999851370166835 key: train_jcc value: [0.98220641 1. 1. 0.9787234 0.97526502 0.98576512 0.98220641 0.9787234 0.9787234 1. ] mean value: 0.9861613166376862 MCC on Blind test: 0.15 Accuracy on Blind test: 0.4 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01437426 0.01120424 0.0100081 0.00968909 0.00983858 0.00988436 0.01077795 0.01006675 0.00971889 0.01089692] mean value: 0.010645914077758788 key: score_time value: [0.01239276 0.00954723 0.00908661 0.00897408 0.00899839 0.00888991 0.00891042 0.00963974 0.00894403 0.00965524] mean value: 0.009503841400146484 key: test_mcc value: [0.48712471 0.31590883 0.56769924 0.62096774 0.48712471 0.76032282 0.59764284 0.54667108 0.33208342 0.33864811] mean value: 0.5054193489018697 key: train_mcc value: [0.50746207 0.58785983 0.62257686 0.51310898 0.53815026 0.61847806 0.59992952 0.56870287 0.61596238 0.53164607] mean value: 0.5703876902905627 key: test_accuracy value: [0.76595745 0.65957447 0.80851064 0.82978723 0.76595745 0.89361702 0.80851064 0.7826087 0.69565217 0.67391304] mean value: 0.76840888066605 key: train_accuracy value: [0.78333333 0.80952381 0.82142857 0.77380952 0.78809524 0.81904762 0.81190476 0.79809976 0.81947743 0.75771971] mean value: 0.7982439769256872 key: test_fscore value: [0.81967213 0.71428571 0.85714286 0.87096774 0.81967213 0.92307692 0.84745763 0.82758621 0.76666667 0.72727273] mean value: 0.817380072669065 key: train_fscore value: [0.83950617 0.85185185 0.85875706 0.82309125 0.83609576 0.85660377 0.85178236 0.8411215 0.8576779 0.79518072] mean value: 0.8411668357185847 key: test_precision value: [0.83333333 0.8 0.84375 0.87096774 0.83333333 0.88235294 0.89285714 0.88888889 0.79310345 0.8 ] mean value: 0.8438586829800515 key: train_precision value: [0.82352941 0.8778626 0.90118577 0.85328185 0.85660377 0.90079365 0.89019608 0.87548638 0.89453125 0.90410959] mean value: 0.8777580354391377 key: test_recall value: [0.80645161 0.64516129 0.87096774 0.87096774 0.80645161 0.96774194 0.80645161 0.77419355 0.74193548 0.66666667] mean value: 0.7956989247311828 key: train_recall value: [0.85611511 0.82733813 0.82014388 0.79496403 0.81654676 0.81654676 0.81654676 0.80935252 0.82374101 0.70967742] mean value: 0.8090972383383616 key: test_roc_auc value: [0.74697581 0.66633065 0.77923387 0.81048387 0.74697581 0.85887097 0.80947581 0.78709677 0.67096774 0.67708333] mean value: 0.7553494623655914 key: train_roc_auc value: [0.74848009 0.80099301 0.82204377 0.7636792 0.77447056 0.82024521 0.80968183 0.79278815 0.81746491 0.78089505] mean value: 0.793074178117275 key: test_jcc value: [0.69444444 0.55555556 0.75 0.77142857 0.69444444 0.85714286 0.73529412 0.70588235 0.62162162 0.57142857] mean value: 0.6957242536654301 key: train_jcc value: [0.72340426 0.74193548 0.75247525 0.69936709 0.71835443 0.74917492 0.74183007 0.72580645 0.75081967 0.66 ] mean value: 0.7263167612297488 MCC on Blind test: 0.17 Accuracy on Blind test: 0.49 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01024628 0.01106691 0.01009178 0.01026082 0.01036048 0.01026225 0.01013184 0.01017618 0.01020217 0.01028705] mean value: 0.010308575630187989 key: score_time value: [0.00911188 0.00916982 0.00888133 0.0091939 0.00980973 0.00894856 0.00905204 0.00923276 0.00895095 0.00896621] mean value: 0.009131717681884765 key: test_mcc value: [0.72715272 0.52620968 0.50611184 0.71025956 0.71206211 0.56329266 0.55956342 0.49033059 0.24538756 0.65669997] mean value: 0.5697070127041126 key: train_mcc value: [0.60428127 0.6506538 0.68534362 0.63499734 0.65670743 0.66210484 0.65614514 0.67599229 0.67555291 0.64020363] mean value: 0.654198226500829 key: test_accuracy value: [0.87234043 0.78723404 0.78723404 0.87234043 0.87234043 0.80851064 0.80851064 0.7826087 0.67391304 0.84782609] mean value: 0.8112858464384829 key: train_accuracy value: [0.82619048 0.84761905 0.86190476 0.84047619 0.85 0.85238095 0.85 0.85748219 0.85748219 0.8432304 ] mean value: 0.8486766202918222 key: test_fscore value: [0.9 0.83870968 0.84848485 0.90625 0.90909091 0.86956522 0.86153846 0.84375 0.76190476 0.88888889] mean value: 0.8628182764718529 key: train_fscore value: [0.87170475 0.88965517 0.8986014 0.88347826 0.89081456 0.89273356 0.89156627 0.8951049 0.89547038 0.8862069 ] mean value: 0.8895336139116604 key: test_precision value: [0.93103448 0.83870968 0.8 0.87878788 0.85714286 0.78947368 0.82352941 0.81818182 0.75 0.84848485] mean value: 0.8335344658750611 key: train_precision value: [0.85223368 0.85430464 0.87414966 0.85521886 0.85953177 0.86 0.85478548 0.8707483 0.86824324 0.8538206 ] mean value: 0.8603036219513056 key: test_recall value: [0.87096774 0.83870968 0.90322581 0.93548387 0.96774194 0.96774194 0.90322581 0.87096774 0.77419355 0.93333333] mean value: 0.8965591397849463 key: train_recall value: [0.89208633 0.92805755 0.92446043 0.91366906 0.92446043 0.92805755 0.93165468 0.92086331 0.92446043 0.92114695] mean value: 0.920891673757768 key: test_roc_auc value: [0.87298387 0.76310484 0.7328629 0.84274194 0.82762097 0.73387097 0.7641129 0.73548387 0.62043011 0.81041667] mean value: 0.7703629032258065 key: train_roc_auc value: [0.79463471 0.8090992 0.83194853 0.80542608 0.81434289 0.81614145 0.81089776 0.82756452 0.82586658 0.8056439 ] mean value: 0.8141565627727084 key: test_jcc value: [0.81818182 0.72222222 0.73684211 0.82857143 0.83333333 0.76923077 0.75675676 0.72972973 0.61538462 0.8 ] mean value: 0.7610252778673832 key: train_jcc value: [0.77258567 0.80124224 0.81587302 0.79127726 0.803125 0.80625 0.80434783 0.81012658 0.81072555 0.79566563] mean value: 0.8011218775337603 MCC on Blind test: 0.24 Accuracy on Blind test: 0.5 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00970244 0.00998521 0.01048708 0.00994468 0.00962806 0.0105741 0.01067281 0.01052999 0.00967312 0.01057601] mean value: 0.010177350044250489 key: score_time value: [0.0793097 0.01351643 0.01241183 0.01221108 0.01160502 0.0174439 0.01262283 0.01519203 0.01503754 0.01247334] mean value: 0.020182371139526367 key: test_mcc value: [0.4512753 0.60908698 0.557325 0.60908698 0.76034808 0.17507316 0.50611184 0.47977675 0.35831956 0.39770584] mean value: 0.4904109472286944 key: train_mcc value: [0.65026131 0.59308253 0.62246377 0.59871999 0.57608635 0.62868128 0.60446554 0.62393453 0.67621075 0.65060935] mean value: 0.6224515401625214 key: test_accuracy value: [0.76595745 0.82978723 0.80851064 0.82978723 0.89361702 0.65957447 0.78723404 0.7826087 0.73913043 0.73913043] mean value: 0.7835337650323774 key: train_accuracy value: [0.84761905 0.82380952 0.83571429 0.82619048 0.81666667 0.83809524 0.82857143 0.83610451 0.85748219 0.847981 ] mean value: 0.8358234362628661 key: test_fscore value: [0.8358209 0.87878788 0.86567164 0.87878788 0.92063492 0.76470588 0.84848485 0.84848485 0.82352941 0.8125 ] mean value: 0.8477408206611455 key: train_fscore value: [0.89078498 0.87414966 0.88123924 0.87606112 0.86882453 0.88235294 0.87878788 0.8836425 0.8989899 0.89115646] mean value: 0.8825989214867034 key: test_precision value: [0.77777778 0.82857143 0.80555556 0.82857143 0.90625 0.7027027 0.8 0.8 0.75675676 0.76470588] mean value: 0.7970891532288591 key: train_precision value: [0.8474026 0.82903226 0.84488449 0.82958199 0.82524272 0.85 0.82594937 0.83174603 0.84493671 0.84789644] mean value: 0.8376672603756541 key: test_recall value: [0.90322581 0.93548387 0.93548387 0.93548387 0.93548387 0.83870968 0.90322581 0.90322581 0.90322581 0.86666667] mean value: 0.9060215053763441 key: train_recall value: [0.93884892 0.92446043 0.92086331 0.92805755 0.91726619 0.91726619 0.93884892 0.94244604 0.96043165 0.9390681 ] mean value: 0.9327557308991516 key: test_roc_auc value: [0.7016129 0.78024194 0.74899194 0.78024194 0.87399194 0.57560484 0.7328629 0.71827957 0.6516129 0.68333333] mean value: 0.7246774193548388 key: train_roc_auc value: [0.8039315 0.7756105 0.7949387 0.77740906 0.76849225 0.80018239 0.77576249 0.78590834 0.80888716 0.80404109] mean value: 0.7895163466866486 key: test_jcc value: [0.71794872 0.78378378 0.76315789 0.78378378 0.85294118 0.61904762 0.73684211 0.73684211 0.7 0.68421053] mean value: 0.737855771261344 key: train_jcc value: [0.80307692 0.77643505 0.78769231 0.77945619 0.76807229 0.78947368 0.78378378 0.79154079 0.81651376 0.80368098] mean value: 0.7899725755152334 MCC on Blind test: 0.16 Accuracy on Blind test: 0.39 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.02232766 0.01876426 0.02162337 0.01979256 0.02090979 0.0185473 0.01879716 0.01888871 0.01875162 0.01901221] mean value: 0.019741463661193847 key: score_time value: [0.01124477 0.01115179 0.01221538 0.0110321 0.01119208 0.01118159 0.01107144 0.01110148 0.01091552 0.01118755] mean value: 0.011229372024536133 key: test_mcc value: [0.86070252 0.72363572 0.557325 0.71206211 0.66337469 0.6139232 0.61207663 0.59332241 0.30795894 0.72168784] mean value: 0.6366069048812995 key: train_mcc value: [0.69610881 0.69057002 0.70716866 0.69159168 0.69057002 0.73588387 0.70257433 0.67685276 0.72049649 0.69640412] mean value: 0.7008220750346401 key: test_accuracy value: [0.93617021 0.87234043 0.80851064 0.87234043 0.85106383 0.82978723 0.82978723 0.82608696 0.7173913 0.86956522] mean value: 0.841304347826087 key: train_accuracy value: [0.86666667 0.86428571 0.87142857 0.86428571 0.86428571 0.88333333 0.86904762 0.85748219 0.87648456 0.86698337] mean value: 0.8684283452098179 key: test_fscore value: [0.95384615 0.91176471 0.86567164 0.90909091 0.89552239 0.88235294 0.875 0.875 0.80597015 0.90909091] mean value: 0.8883309798191273 key: train_fscore value: [0.9047619 0.90322581 0.90784983 0.90387858 0.90322581 0.9165247 0.90693739 0.89932886 0.91156463 0.90508475] mean value: 0.9062382257284957 key: test_precision value: [0.91176471 0.83783784 0.80555556 0.85714286 0.83333333 0.81081081 0.84848485 0.84848485 0.75 0.83333333] mean value: 0.8336748130865778 key: train_precision value: [0.85806452 0.85530547 0.86363636 0.85079365 0.85530547 0.87055016 0.85623003 0.8427673 0.86451613 0.8585209 ] mean value: 0.8575689981747396 key: test_recall value: [1. 1. 0.93548387 0.96774194 0.96774194 0.96774194 0.90322581 0.90322581 0.87096774 1. ] mean value: 0.9516129032258065 key: train_recall value: [0.95683453 0.95683453 0.95683453 0.96402878 0.95683453 0.9676259 0.96402878 0.96402878 0.96402878 0.95698925] mean value: 0.9608068384002476 key: test_roc_auc value: [0.90625 0.8125 0.74899194 0.82762097 0.79637097 0.76512097 0.7953629 0.78494624 0.63548387 0.8125 ] mean value: 0.7885147849462366 key: train_roc_auc value: [0.82348769 0.81996656 0.83052994 0.81652143 0.81996656 0.84296788 0.82356368 0.80718921 0.83516124 0.82356505] mean value: 0.8242919250604606 key: test_jcc value: [0.91176471 0.83783784 0.76315789 0.83333333 0.81081081 0.78947368 0.77777778 0.77777778 0.675 0.83333333] mean value: 0.8010267155700592 key: train_jcc value: [0.82608696 0.82352941 0.83125 0.82461538 0.82352941 0.84591195 0.82972136 0.81707317 0.8375 0.82662539] mean value: 0.8285843034309783 MCC on Blind test: 0.23 Accuracy on Blind test: 0.42 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [0.41509366 1.89070988 0.35135531 1.76246381 1.76783442 1.66246915 1.52473354 0.69678211 0.60859632 1.48020411] mean value: 1.2160242319107055 key: score_time value: [0.01236773 0.01505613 0.01239443 0.01246619 0.02640224 0.01245737 0.01242447 0.01241732 0.01241112 0.01234865] mean value: 0.014074563980102539 key: test_mcc value: [0.90662544 0.67402153 0.55956342 0.95299692 0.95299692 0.66337469 0.71572581 0.75776742 0.44695591 0.61666667] mean value: 0.7246694736926886 key: train_mcc value: [0.75124204 0.98408226 0.78027884 0.95249586 0.9680267 0.95736701 0.90500503 0.83068165 0.81289932 0.95227009] mean value: 0.8894348800248977 key: test_accuracy value: [0.95744681 0.85106383 0.80851064 0.9787234 0.9787234 0.85106383 0.87234043 0.89130435 0.76086957 0.82608696] mean value: 0.877613320999075 key: train_accuracy value: [0.89047619 0.99285714 0.90238095 0.97857143 0.98571429 0.98095238 0.95714286 0.9239905 0.91686461 0.97862233] mean value: 0.9507572672774574 key: test_fscore value: [0.96875 0.8852459 0.86153846 0.98412698 0.98412698 0.89552239 0.90322581 0.91803279 0.82539683 0.86666667] mean value: 0.9092632804891827 key: train_fscore value: [0.91958042 0.99459459 0.92691622 0.9840708 0.98924731 0.98571429 0.96853147 0.94482759 0.93913043 0.9840708 ] mean value: 0.9636683915192453 key: test_precision value: [0.93939394 0.9 0.82352941 0.96875 0.96875 0.83333333 0.90322581 0.93333333 0.8125 0.86666667] mean value: 0.8949482490943592 key: train_precision value: [0.89455782 0.99638989 0.91872792 0.96864111 0.98571429 0.9787234 0.94217687 0.90728477 0.90909091 0.97202797] mean value: 0.9473334955051633 key: test_recall value: [1. 0.87096774 0.90322581 1. 1. 0.96774194 0.90322581 0.90322581 0.83870968 0.86666667] mean value: 0.9253763440860215 key: train_recall value: [0.94604317 0.99280576 0.9352518 1. 0.99280576 0.99280576 0.99640288 0.98561151 0.97122302 0.99641577] mean value: 0.9809365410897088 key: test_roc_auc value: [0.9375 0.84173387 0.7641129 0.96875 0.96875 0.79637097 0.8578629 0.88494624 0.71935484 0.80833333] mean value: 0.854771505376344 key: train_roc_auc value: [0.86386665 0.99288175 0.88663998 0.96830986 0.98231837 0.97527612 0.93834228 0.89490366 0.89120592 0.97003887] mean value: 0.9363783463845078 key: test_jcc value: [0.93939394 0.79411765 0.75675676 0.96875 0.96875 0.81081081 0.82352941 0.84848485 0.7027027 0.76470588] mean value: 0.8378001999325528 key: train_jcc value: [0.85113269 0.98924731 0.86378738 0.96864111 0.9787234 0.97183099 0.93898305 0.89542484 0.8852459 0.96864111] mean value: 0.931165778255146 MCC on Blind test: 0.13 Accuracy on Blind test: 0.37 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02629733 0.02072287 0.01916671 0.02101421 0.01953888 0.02145672 0.02063489 0.02003598 0.02033472 0.01894474] mean value: 0.02081470489501953 key: score_time value: [0.01246381 0.00959682 0.00929546 0.00943065 0.00951147 0.00888705 0.00958252 0.0094862 0.00885487 0.00886083] mean value: 0.009596967697143554 key: test_mcc value: [0.90524194 0.81048387 1. 0.91188882 0.90662544 0.76032282 1. 0.86757603 0.90107527 0.80651412] mean value: 0.8869728313481567 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95744681 0.91489362 1. 0.95744681 0.95744681 0.89361702 1. 0.93478261 0.95652174 0.91304348] mean value: 0.9485198889916744 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.96774194 0.93548387 1. 0.96666667 0.96875 0.92307692 1. 0.94915254 0.96774194 0.93548387] mean value: 0.9614097745019696 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96774194 0.93548387 1. 1. 0.93939394 0.88235294 1. 1. 0.96774194 0.90625 ] mean value: 0.9598964622505894 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 1. 0.93548387 1. 0.96774194 1. 0.90322581 0.96774194 0.96666667] mean value: 0.9644086021505376 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.95262097 0.90524194 1. 0.96774194 0.9375 0.85887097 1. 0.9516129 0.95053763 0.88958333] mean value: 0.9413709677419355 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9375 0.87878788 1. 0.93548387 0.93939394 0.85714286 1. 0.90322581 0.9375 0.87878788] mean value: 0.9267822231531909 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.11682916 0.11507559 0.12848043 0.12166643 0.11733985 0.11582923 0.11812663 0.117419 0.11527824 0.11577892] mean value: 0.11818234920501709 key: score_time value: [0.01781154 0.01825404 0.01940131 0.01899099 0.01792264 0.01776505 0.01769257 0.01761508 0.01792526 0.01881552] mean value: 0.018219399452209472 key: test_mcc value: [0.90662544 0.76034808 0.76942439 0.81048387 0.76942439 0.50421069 0.66402366 0.79930604 0.59332241 0.7073172 ] mean value: 0.7284486166045462 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95744681 0.89361702 0.89361702 0.91489362 0.89361702 0.78723404 0.85106383 0.91304348 0.82608696 0.86956522] mean value: 0.8800185013876041 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.96875 0.92063492 0.92537313 0.93548387 0.92537313 0.85294118 0.88888889 0.9375 0.875 0.90322581] mean value: 0.9133170932070469 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.93939394 0.90625 0.86111111 0.93548387 0.86111111 0.78378378 0.875 0.90909091 0.84848485 0.875 ] mean value: 0.8794709573943444 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.93548387 1. 0.93548387 1. 0.93548387 0.90322581 0.96774194 0.90322581 0.93333333] mean value: 0.9513978494623656 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9375 0.87399194 0.84375 0.90524194 0.84375 0.71774194 0.8266129 0.88387097 0.78494624 0.84166667] mean value: 0.8459072580645162 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.93939394 0.85294118 0.86111111 0.87878788 0.86111111 0.74358974 0.8 0.88235294 0.77777778 0.82352941] mean value: 0.8420595091183327 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.17 Accuracy on Blind test: 0.35 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01028204 0.01027584 0.01097631 0.01002836 0.01014161 0.01016474 0.00999165 0.01011252 0.01021099 0.01017547] mean value: 0.010235953330993652 key: score_time value: [0.00873184 0.00867677 0.00888753 0.00895524 0.0091784 0.00871277 0.00871825 0.0088079 0.00915551 0.00878716] mean value: 0.008861136436462403 key: test_mcc value: [0.47146788 0.66337469 0.30022788 0.48712471 0.50421069 0.13312621 0.43145161 0.47977675 0.23600897 0.61666667] mean value: 0.43234360607693584 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.74468085 0.85106383 0.68085106 0.76595745 0.78723404 0.63829787 0.74468085 0.7826087 0.65217391 0.82608696] mean value: 0.7473635522664199 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.79310345 0.89552239 0.75409836 0.81967213 0.85294118 0.74626866 0.80645161 0.84848485 0.73333333 0.86666667] mean value: 0.8116542622713923 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85185185 0.83333333 0.76666667 0.83333333 0.78378378 0.69444444 0.80645161 0.8 0.75862069 0.86666667] mean value: 0.7995152382638478 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.74193548 0.96774194 0.74193548 0.80645161 0.93548387 0.80645161 0.80645161 0.90322581 0.70967742 0.86666667] mean value: 0.8286021505376344 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.74596774 0.79637097 0.65221774 0.74697581 0.71774194 0.55947581 0.71572581 0.71827957 0.62150538 0.80833333] mean value: 0.7082594086021505 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.65714286 0.81081081 0.60526316 0.69444444 0.74358974 0.5952381 0.67567568 0.73684211 0.57894737 0.76470588] mean value: 0.6862660140833515 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.21 Accuracy on Blind test: 0.5 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.78322387 1.70288587 1.75905633 1.67823339 1.76473522 1.75426316 1.67606902 1.68830752 1.72714567 1.77702308] mean value: 1.7310943126678466 key: score_time value: [0.0979917 0.09790587 0.09109807 0.09100127 0.09731817 0.0949614 0.09058595 0.09162045 0.09331584 0.09899282] mean value: 0.0944791555404663 key: test_mcc value: [0.95299692 0.86070252 0.90662544 0.95436677 0.90662544 0.81503725 0.91188882 0.90107527 0.90229785 0.80651412] mean value: 0.8918130408374696 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9787234 0.93617021 0.95744681 0.9787234 0.95744681 0.91489362 0.95744681 0.95652174 0.95652174 0.91304348] mean value: 0.9506938020351526 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.95384615 0.96875 0.98360656 0.96875 0.93939394 0.96666667 0.96774194 0.96875 0.93548387] mean value: 0.9637116107862406 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.91176471 0.93939394 1. 0.93939394 0.88571429 1. 0.96774194 0.93939394 0.90625 ] mean value: 0.9458402745262328 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 0.96774194 1. 1. 0.93548387 0.96774194 1. 0.96666667] mean value: 0.983763440860215 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96875 0.90625 0.9375 0.98387097 0.9375 0.875 0.96774194 0.95053763 0.93333333 0.88958333] mean value: 0.9350067204301076 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.91176471 0.93939394 0.96774194 0.93939394 0.88571429 0.93548387 0.9375 0.93939394 0.87878788] mean value: 0.9303924495017949 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.09 Accuracy on Blind test: 0.21 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000...05', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( key: fit_time value: [1.85024405 1.00775695 1.08623052 1.03636193 0.93481803 1.00752449 0.95746088 0.99271202 0.97024059 1.0156703 ] mean value: 1.0859019756317139 key: score_time value: [0.22596669 0.28282785 0.23712778 0.22467446 0.17264795 0.14795899 0.25987649 0.2267096 0.27685618 0.23671722] mean value: 0.229136323928833 key: test_mcc value: [0.95299692 0.86070252 0.90662544 0.95299692 0.90662544 0.76942439 0.86091836 0.85009261 0.8059304 0.75806977] mean value: 0.8624382772371741 key: train_mcc value: [0.93680867 0.95249586 0.94725945 0.94725945 0.95249586 0.95249586 0.94725945 0.94751034 0.94751034 0.96303439] mean value: 0.9494129679989223 key: test_accuracy value: [0.9787234 0.93617021 0.95744681 0.9787234 0.95744681 0.89361702 0.93617021 0.93478261 0.91304348 0.89130435] mean value: 0.9377428307123035 key: train_accuracy value: [0.97142857 0.97857143 0.97619048 0.97619048 0.97857143 0.97857143 0.97619048 0.97624703 0.97624703 0.98337292] mean value: 0.9771581269087207 key: test_fscore value: [0.98412698 0.95384615 0.96875 0.98412698 0.96875 0.92537313 0.95081967 0.95238095 0.93939394 0.92063492] mean value: 0.954820274096944 key: train_fscore value: [0.97887324 0.9840708 0.98233216 0.98233216 0.9840708 0.9840708 0.98233216 0.98233216 0.98233216 0.98761062] mean value: 0.9830357025671337 key: test_precision value: [0.96875 0.91176471 0.93939394 0.96875 0.93939394 0.86111111 0.96666667 0.9375 0.88571429 0.87878788] mean value: 0.9257832526950174 key: train_precision value: [0.95862069 0.96864111 0.96527778 0.96527778 0.96864111 0.96864111 0.96527778 0.96527778 0.96527778 0.97552448] mean value: 0.9666457399016273 key: test_recall value: [1. 1. 1. 1. 1. 1. 0.93548387 0.96774194 1. 0.96666667] mean value: 0.9869892473118279 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96875 0.90625 0.9375 0.96875 0.9375 0.84375 0.93649194 0.9172043 0.86666667 0.85833333] mean value: 0.9141196236559139 key: train_roc_auc value: [0.95774648 0.96830986 0.96478873 0.96478873 0.96830986 0.96830986 0.96478873 0.96503497 0.96503497 0.97535211] mean value: 0.9662464296267113 key: test_jcc value: [0.96875 0.91176471 0.93939394 0.96875 0.93939394 0.86111111 0.90625 0.90909091 0.88571429 0.85294118] mean value: 0.9143160067057126 key: train_jcc value: [0.95862069 0.96864111 0.96527778 0.96527778 0.96864111 0.96864111 0.96527778 0.96527778 0.96527778 0.97552448] mean value: 0.9666457399016273 MCC on Blind test: 0.1 Accuracy on Blind test: 0.22 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02664924 0.01019883 0.010216 0.01071167 0.01019812 0.01019883 0.01137066 0.0105741 0.01056242 0.01117802] mean value: 0.012185788154602051 key: score_time value: [0.01141644 0.00912237 0.00920796 0.00904679 0.00896335 0.00902796 0.00981045 0.00913143 0.00937605 0.01008081] mean value: 0.00951836109161377 key: test_mcc value: [0.72715272 0.52620968 0.50611184 0.71025956 0.71206211 0.56329266 0.55956342 0.49033059 0.24538756 0.65669997] mean value: 0.5697070127041126 key: train_mcc value: [0.60428127 0.6506538 0.68534362 0.63499734 0.65670743 0.66210484 0.65614514 0.67599229 0.67555291 0.64020363] mean value: 0.654198226500829 key: test_accuracy value: [0.87234043 0.78723404 0.78723404 0.87234043 0.87234043 0.80851064 0.80851064 0.7826087 0.67391304 0.84782609] mean value: 0.8112858464384829 key: train_accuracy value: [0.82619048 0.84761905 0.86190476 0.84047619 0.85 0.85238095 0.85 0.85748219 0.85748219 0.8432304 ] mean value: 0.8486766202918222 key: test_fscore value: [0.9 0.83870968 0.84848485 0.90625 0.90909091 0.86956522 0.86153846 0.84375 0.76190476 0.88888889] mean value: 0.8628182764718529 key: train_fscore value: [0.87170475 0.88965517 0.8986014 0.88347826 0.89081456 0.89273356 0.89156627 0.8951049 0.89547038 0.8862069 ] mean value: 0.8895336139116604 key: test_precision value: [0.93103448 0.83870968 0.8 0.87878788 0.85714286 0.78947368 0.82352941 0.81818182 0.75 0.84848485] mean value: 0.8335344658750611 key: train_precision value: [0.85223368 0.85430464 0.87414966 0.85521886 0.85953177 0.86 0.85478548 0.8707483 0.86824324 0.8538206 ] mean value: 0.8603036219513056 key: test_recall value: [0.87096774 0.83870968 0.90322581 0.93548387 0.96774194 0.96774194 0.90322581 0.87096774 0.77419355 0.93333333] mean value: 0.8965591397849463 key: train_recall value: [0.89208633 0.92805755 0.92446043 0.91366906 0.92446043 0.92805755 0.93165468 0.92086331 0.92446043 0.92114695] mean value: 0.920891673757768 key: test_roc_auc value: [0.87298387 0.76310484 0.7328629 0.84274194 0.82762097 0.73387097 0.7641129 0.73548387 0.62043011 0.81041667] mean value: 0.7703629032258065 key: train_roc_auc value: [0.79463471 0.8090992 0.83194853 0.80542608 0.81434289 0.81614145 0.81089776 0.82756452 0.82586658 0.8056439 ] mean value: 0.8141565627727084 key: test_jcc value: [0.81818182 0.72222222 0.73684211 0.82857143 0.83333333 0.76923077 0.75675676 0.72972973 0.61538462 0.8 ] mean value: 0.7610252778673832 key: train_jcc value: [0.77258567 0.80124224 0.81587302 0.79127726 0.803125 0.80625 0.80434783 0.81012658 0.81072555 0.79566563] mean value: 0.8011218775337603 MCC on Blind test: 0.24 Accuracy on Blind test: 0.5 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.1078825 0.06452894 0.10497737 0.06652641 0.06726933 0.06478691 0.07195568 0.22934031 0.06012368 0.06932712] mean value: 0.09067182540893555 key: score_time value: [0.01200962 0.01149297 0.01164556 0.0111506 0.01142526 0.01123977 0.01101708 0.01149511 0.01077366 0.01080608] mean value: 0.011305570602416992 key: test_mcc value: [0.95299692 0.8566725 1. 1. 0.90662544 0.8084425 1. 0.9085301 0.95087679 0.80833333] mean value: 0.9192477588152645 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9787234 0.93617021 1. 1. 0.95744681 0.91489362 1. 0.95652174 0.97826087 0.91304348] mean value: 0.9635060129509714 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.95238095 1. 1. 0.96875 0.9375 1. 0.96666667 0.98412698 0.93333333] mean value: 0.9726884920634921 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.9375 1. 1. 0.93939394 0.90909091 1. 1. 0.96875 0.93333333] mean value: 0.9656818181818182 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 1. 1. 0.96774194 1. 0.93548387 1. 0.93333333] mean value: 0.9804301075268818 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96875 0.92137097 1. 1. 0.9375 0.89012097 1. 0.96774194 0.96666667 0.90416667] mean value: 0.9556317204301076 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.90909091 1. 1. 0.93939394 0.88235294 1. 0.93548387 0.96875 0.875 ] mean value: 0.9478821660629061 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.05 Accuracy on Blind test: 0.19 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04805613 0.04128599 0.06816578 0.05686855 0.07966805 0.05619812 0.08012176 0.05640578 0.06545734 0.05767536] mean value: 0.06099028587341308 key: score_time value: [0.01277566 0.01246333 0.01240373 0.03590488 0.02149439 0.01272893 0.02091289 0.01216292 0.02337241 0.01243329] mean value: 0.01766524314880371 key: test_mcc value: [0.81952077 0.71025956 1. 0.91188882 0.90524194 0.66337469 0.81048387 0.85513419 0.70322581 0.67015231] mean value: 0.8049281962952315 key: train_mcc value: [0.95736701 0.96269263 0.95736701 0.95199661 0.95199661 0.95736701 0.95736701 0.96823254 0.95756757 0.96812026] mean value: 0.9590074282899193 key: test_accuracy value: [0.91489362 0.87234043 1. 0.95744681 0.95744681 0.85106383 0.91489362 0.93478261 0.86956522 0.84782609] mean value: 0.9120259019426458 key: train_accuracy value: [0.98095238 0.98333333 0.98095238 0.97857143 0.97857143 0.98095238 0.98095238 0.98574822 0.98099762 0.98574822] mean value: 0.9816779776043434 key: test_fscore value: [0.93333333 0.90625 1. 0.96666667 0.96774194 0.89552239 0.93548387 0.95081967 0.90322581 0.88135593] mean value: 0.9340399605297465 key: train_fscore value: [0.98571429 0.98747764 0.98571429 0.98389982 0.98389982 0.98571429 0.98571429 0.98928571 0.98571429 0.98932384] mean value: 0.9862458267132189 key: test_precision value: [0.96551724 0.87878788 1. 1. 0.96774194 0.83333333 0.93548387 0.96666667 0.90322581 0.89655172] mean value: 0.9347308457208346 key: train_precision value: [0.9787234 0.98220641 0.9787234 0.97864769 0.97864769 0.9787234 0.9787234 0.9822695 0.9787234 0.98233216] mean value: 0.9797720459659157 key: test_recall value: [0.90322581 0.93548387 1. 0.93548387 0.96774194 0.96774194 0.93548387 0.93548387 0.90322581 0.86666667] mean value: 0.9350537634408602 key: train_recall value: [0.99280576 0.99280576 0.99280576 0.98920863 0.98920863 0.99280576 0.99280576 0.99640288 0.99280576 0.99641577] mean value: 0.9928070446868312 key: test_roc_auc value: [0.9203629 0.84274194 1. 0.96774194 0.95262097 0.79637097 0.90524194 0.9344086 0.8516129 0.83958333] mean value: 0.9010685483870968 key: train_roc_auc value: [0.97527612 0.97879724 0.97527612 0.97347756 0.97347756 0.97527612 0.97527612 0.98071892 0.97542386 0.98060225] mean value: 0.9763601853986702 key: test_jcc value: [0.875 0.82857143 1. 0.93548387 0.9375 0.81081081 0.87878788 0.90625 0.82352941 0.78787879] mean value: 0.8783812188781354 key: train_jcc value: [0.97183099 0.97526502 0.97183099 0.96830986 0.96830986 0.97183099 0.97183099 0.97879859 0.97183099 0.97887324] mean value: 0.9728711491564227 MCC on Blind test: 0.13 Accuracy on Blind test: 0.37 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02241898 0.01315236 0.01071215 0.01068687 0.00957108 0.00949717 0.00947452 0.00976682 0.00959682 0.00956392] mean value: 0.01144406795501709 key: score_time value: [0.01186442 0.01024771 0.00938439 0.00939918 0.00869703 0.0087316 0.00866318 0.00860071 0.00864935 0.00871134] mean value: 0.009294891357421875 key: test_mcc value: [0.63478467 0.56769924 0.51389369 0.65994312 0.62096774 0.66337469 0.47137482 0.64852426 0.44695591 0.76764947] mean value: 0.5995167617624015 key: train_mcc value: [0.59337085 0.6486968 0.6639652 0.6261021 0.62393742 0.6523944 0.65818223 0.62211627 0.6562151 0.62652246] mean value: 0.6371502844408112 key: test_accuracy value: [0.82978723 0.80851064 0.78723404 0.85106383 0.82978723 0.85106383 0.76595745 0.84782609 0.76086957 0.89130435] mean value: 0.8223404255319149 key: train_accuracy value: [0.82142857 0.8452381 0.85238095 0.83571429 0.83571429 0.84761905 0.85 0.83372922 0.847981 0.83610451] mean value: 0.8405909964936094 key: test_fscore value: [0.86666667 0.85714286 0.84375 0.89230769 0.87096774 0.89552239 0.82539683 0.88888889 0.82539683 0.92307692] mean value: 0.8689116808871864 key: train_fscore value: [0.86818981 0.88536155 0.89122807 0.87873462 0.88 0.88811189 0.88966725 0.87719298 0.8869258 0.87915937] mean value: 0.8824571336612159 key: test_precision value: [0.89655172 0.84375 0.81818182 0.85294118 0.87096774 0.83333333 0.8125 0.875 0.8125 0.85714286] mean value: 0.8472868651202012 key: train_precision value: [0.84879725 0.86851211 0.86986301 0.85910653 0.85185185 0.86394558 0.8668942 0.85616438 0.87152778 0.85958904] mean value: 0.8616251734964677 key: test_recall value: [0.83870968 0.87096774 0.87096774 0.93548387 0.87096774 0.96774194 0.83870968 0.90322581 0.83870968 1. ] mean value: 0.8935483870967742 key: train_recall value: [0.88848921 0.9028777 0.91366906 0.89928058 0.91007194 0.91366906 0.91366906 0.89928058 0.9028777 0.89964158] mean value: 0.9043526469147263 key: test_roc_auc value: [0.82560484 0.77923387 0.74798387 0.81149194 0.81048387 0.79637097 0.73185484 0.81827957 0.71935484 0.84375 ] mean value: 0.7884408602150538 key: train_roc_auc value: [0.78931503 0.81763603 0.82303172 0.80527409 0.80010639 0.81598946 0.81951059 0.80278714 0.82206822 0.80545459] mean value: 0.8101173261166756 key: test_jcc value: [0.76470588 0.75 0.72972973 0.80555556 0.77142857 0.81081081 0.7027027 0.8 0.7027027 0.85714286] mean value: 0.7694778812425871 key: train_jcc value: [0.76708075 0.7943038 0.80379747 0.78369906 0.78571429 0.79874214 0.80126183 0.78125 0.7968254 0.784375 ] mean value: 0.7897049721282987 MCC on Blind test: 0.22 Accuracy on Blind test: 0.5 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01426268 0.02361226 0.01946807 0.02516937 0.01924849 0.02545094 0.02460933 0.02311087 0.01817918 0.02519631] mean value: 0.02183074951171875 key: score_time value: [0.00876379 0.01119065 0.01151466 0.01174283 0.0117209 0.01172209 0.01216221 0.01199412 0.01224375 0.01191258] mean value: 0.011496758460998536 key: test_mcc value: [0.86070252 0.71572581 0.76032282 0.95299692 0.81952077 0.71206211 0.8084425 0.69956858 0.43161973 0.80651412] mean value: 0.7567475880486182 key: train_mcc value: [0.82321411 0.9627116 0.89833067 0.93097611 0.8239525 0.95734993 0.89402196 0.84253494 0.86216499 0.94174218] mean value: 0.8936998993308193 key: test_accuracy value: [0.93617021 0.87234043 0.89361702 0.9787234 0.91489362 0.87234043 0.91489362 0.84782609 0.76086957 0.91304348] mean value: 0.8904717853839038 key: train_accuracy value: [0.91904762 0.98333333 0.9547619 0.96904762 0.91428571 0.98095238 0.95238095 0.9216152 0.93824228 0.97387173] mean value: 0.9507538739961543 key: test_fscore value: [0.95384615 0.90322581 0.92307692 0.98412698 0.93333333 0.90909091 0.9375 0.87719298 0.83076923 0.93548387] mean value: 0.9187646194119029 key: train_fscore value: [0.94237288 0.98743268 0.96625222 0.97657658 0.93207547 0.98566308 0.96503497 0.9373814 0.95470383 0.98059965] mean value: 0.9628092756589914 key: test_precision value: [0.91176471 0.90322581 0.88235294 0.96875 0.96551724 0.85714286 0.90909091 0.96153846 0.79411765 0.90625 ] mean value: 0.9059750569720798 key: train_precision value: [0.89102564 0.98566308 0.95438596 0.97833935 0.98015873 0.98214286 0.93877551 0.99196787 0.92567568 0.96527778] mean value: 0.959341246100077 key: test_recall value: [1. 0.90322581 0.96774194 1. 0.90322581 0.96774194 0.96774194 0.80645161 0.87096774 0.96666667] mean value: 0.9353763440860215 key: train_recall value: [1. 0.98920863 0.97841727 0.97482014 0.88848921 0.98920863 0.99280576 0.88848921 0.98561151 0.99641577] mean value: 0.968346613032155 key: test_roc_auc value: [0.90625 0.8578629 0.85887097 0.96875 0.9203629 0.82762097 0.89012097 0.86989247 0.70215054 0.88958333] mean value: 0.8691465053763441 key: train_roc_auc value: [0.88028169 0.98051981 0.94343399 0.96628331 0.92663897 0.97699868 0.9330226 0.9372516 0.91588268 0.96299662] mean value: 0.942330993899117 key: test_jcc value: [0.91176471 0.82352941 0.85714286 0.96875 0.875 0.83333333 0.88235294 0.78125 0.71052632 0.87878788] mean value: 0.8522437443877072 key: train_jcc value: [0.89102564 0.9751773 0.9347079 0.95422535 0.87279152 0.97173145 0.93243243 0.88214286 0.91333333 0.96193772] mean value: 0.9289505509252404 MCC on Blind test: 0.16 Accuracy on Blind test: 0.29 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01881099 0.01925945 0.01931882 0.02046704 0.01870441 0.01791883 0.01955533 0.02114439 0.01979399 0.02047253] mean value: 0.019544577598571776 key: score_time value: [0.01178288 0.01173759 0.01165867 0.01172376 0.01175308 0.01169109 0.01165581 0.01170611 0.011729 0.01172876] mean value: 0.011716675758361817 key: test_mcc value: [0.90524194 0.8084425 0.56329266 0.90524194 0.90662544 0.66337469 0.68913865 0.85513419 0.69721252 0.81348922] mean value: 0.7807193745773244 key: train_mcc value: [0.95199661 0.92638558 0.75648156 0.93060457 0.92557595 0.8841752 0.89514141 0.94728132 0.87441314 0.89469123] mean value: 0.898674658892361 key: test_accuracy value: [0.95744681 0.91489362 0.80851064 0.95744681 0.95744681 0.85106383 0.85106383 0.93478261 0.86956522 0.91304348] mean value: 0.9015263644773358 key: train_accuracy value: [0.97857143 0.96666667 0.88809524 0.96904762 0.96666667 0.94761905 0.95238095 0.97624703 0.94299287 0.95249406] mean value: 0.9540781585793462 key: test_fscore value: [0.96774194 0.9375 0.86956522 0.96774194 0.96875 0.89552239 0.88135593 0.95081967 0.90909091 0.9375 ] mean value: 0.9285587989844194 key: train_fscore value: [0.98389982 0.9754386 0.92205638 0.97674419 0.97526502 0.96180556 0.96363636 0.98194946 0.95847751 0.96527778] mean value: 0.966455067016163 key: test_precision value: [0.96774194 0.90909091 0.78947368 0.96774194 0.93939394 0.83333333 0.92857143 0.96666667 0.85714286 0.88235294] mean value: 0.9041509630553873 key: train_precision value: [0.97864769 0.95205479 0.85538462 0.97153025 0.95833333 0.9295302 0.97426471 0.98550725 0.92333333 0.93602694] mean value: 0.9464613102143273 key: test_recall value: [0.96774194 0.96774194 0.96774194 0.96774194 1. 0.96774194 0.83870968 0.93548387 0.96774194 1. ] mean value: 0.9580645161290323 key: train_recall value: [0.98920863 1. 1. 0.98201439 0.99280576 0.99640288 0.95323741 0.97841727 0.99640288 0.99641577] mean value: 0.9884904979242413 key: test_roc_auc value: [0.95262097 0.89012097 0.73387097 0.95262097 0.9375 0.79637097 0.85685484 0.9344086 0.8172043 0.875 ] mean value: 0.8746572580645161 key: train_roc_auc value: [0.97347756 0.95070423 0.83450704 0.96283818 0.95414936 0.92425778 0.95197082 0.97522262 0.91778186 0.93130648] mean value: 0.9376215909300119 key: test_jcc value: [0.9375 0.88235294 0.76923077 0.9375 0.93939394 0.81081081 0.78787879 0.90625 0.83333333 0.88235294] mean value: 0.8686603523000582 key: train_jcc value: [0.96830986 0.95205479 0.85538462 0.95454545 0.95172414 0.9264214 0.92982456 0.96453901 0.92026578 0.93288591] mean value: 0.9355955521485729 MCC on Blind test: 0.15 Accuracy on Blind test: 0.34 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.17678094 0.16169643 0.16195178 0.16259241 0.16251469 0.16290808 0.16300702 0.1635623 0.16283536 0.16640735] mean value: 0.16442563533782958 key: score_time value: [0.01511002 0.01516867 0.01555228 0.01531434 0.01529408 0.01527691 0.01535225 0.01529121 0.01536965 0.01541305] mean value: 0.015314245223999023 key: test_mcc value: [0.95299692 0.8566725 1. 1. 0.90662544 0.81503725 0.91188882 0.95250095 0.95087679 0.85513419] mean value: 0.9201732856062123 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9787234 0.93617021 1. 1. 0.95744681 0.91489362 0.95744681 0.97826087 0.97826087 0.93478261] mean value: 0.9635985198889917 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.95238095 1. 1. 0.96875 0.93939394 0.96666667 0.98360656 0.98412698 0.95081967] mean value: 0.9729871756203723 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.9375 1. 1. 0.93939394 0.88571429 1. 1. 0.96875 0.93548387] mean value: 0.9635592096075967 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 1. 1. 1. 0.93548387 0.96774194 1. 0.96666667] mean value: 0.983763440860215 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96875 0.92137097 1. 1. 0.9375 0.875 0.96774194 0.98387097 0.96666667 0.92083333] mean value: 0.9541733870967742 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.90909091 1. 1. 0.93939394 0.88571429 0.93548387 0.96774194 0.96875 0.90625 ] mean value: 0.9481174940650747 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.06447458 0.06540036 0.06229663 0.08273578 0.07696605 0.0617063 0.0600605 0.06062365 0.06386662 0.07482314] mean value: 0.06729536056518555 key: score_time value: [0.03304315 0.03502083 0.03025556 0.03909683 0.02619481 0.04146361 0.02470899 0.02245951 0.03020692 0.02624965] mean value: 0.03086998462677002 key: test_mcc value: [0.95299692 0.8566725 0.95299692 1. 0.90662544 0.81503725 0.87213027 0.95250095 0.95087679 0.77787176] mean value: 0.9037708791663533 key: train_mcc value: [0.97879832 1. 0.99468526 0.98945277 0.97870346 0.98938023 0.99470349 0.98950083 0.98940987 0.98946562] mean value: 0.9894099843463138 key: test_accuracy value: [0.9787234 0.93617021 0.9787234 1. 0.95744681 0.91489362 0.93617021 0.97826087 0.97826087 0.89130435] mean value: 0.954995374653099 key: train_accuracy value: [0.99047619 1. 0.99761905 0.9952381 0.99047619 0.9952381 0.99761905 0.99524941 0.99524941 0.99524941] mean value: 0.9952414885193983 key: test_fscore value: [0.98412698 0.95238095 0.98412698 1. 0.96875 0.93939394 0.94915254 0.98360656 0.98412698 0.9122807 ] mean value: 0.965794564566016 key: train_fscore value: [0.99285714 1. 0.99820467 0.99638989 0.99283154 0.99641577 0.9981982 0.99638989 0.99640288 0.99640288] mean value: 0.9964092859536038 key: test_precision value: [0.96875 0.9375 0.96875 1. 0.93939394 0.88571429 1. 1. 0.96875 0.96296296] mean value: 0.9631821188071188 key: train_precision value: [0.9858156 1. 0.99641577 1. 0.98928571 0.99285714 1. 1. 0.99640288 1. ] mean value: 0.9960777108286898 key: test_recall value: [1. 0.96774194 1. 1. 1. 1. 0.90322581 0.96774194 1. 0.86666667] mean value: 0.9705376344086022 key: train_recall value: [1. 1. 1. 0.99280576 0.99640288 1. 0.99640288 0.99280576 0.99640288 0.99283154] mean value: 0.996765168510353 key: test_roc_auc value: [0.96875 0.92137097 0.96875 1. 0.9375 0.875 0.9516129 0.98387097 0.96666667 0.90208333] mean value: 0.9475604838709677 key: train_roc_auc value: [0.98591549 1. 0.99647887 0.99640288 0.98763806 0.99295775 0.99820144 0.99640288 0.99470494 0.99641577] mean value: 0.9945118071449628 key: test_jcc value: [0.96875 0.90909091 0.96875 1. 0.93939394 0.88571429 0.90322581 0.96774194 0.96875 0.83870968] mean value: 0.9350126553553972 key: train_jcc value: [0.9858156 1. 0.99641577 0.99280576 0.98576512 0.99285714 0.99640288 0.99280576 0.99283154 0.99283154] mean value: 0.9928531111784986 MCC on Blind test: 0.05 Accuracy on Blind test: 0.19 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.17185569 0.10447001 0.11902738 0.11162257 0.14436507 0.15038991 0.14318299 0.14804506 0.16088104 0.14451122] mean value: 0.13983509540557862 key: score_time value: [0.02366495 0.01449943 0.01455712 0.02399635 0.02356625 0.02375555 0.02349377 0.02348185 0.02342916 0.02345991] mean value: 0.021790432929992675 key: test_mcc value: [0.6139232 0.44917734 0.56329266 0.60908698 0.71206211 0.39449818 0.34522561 0.53722882 0.19552949 0.55533018] mean value: 0.4975354557722252 key: train_mcc value: [0.95773996 0.93680867 0.94725945 0.94203047 0.94203047 0.96825224 0.93680867 0.9527212 0.9527212 0.95778798] mean value: 0.9494160325679993 key: test_accuracy value: [0.82978723 0.76595745 0.80851064 0.82978723 0.87234043 0.74468085 0.72340426 0.80434783 0.67391304 0.80434783] mean value: 0.7857076780758557 key: train_accuracy value: [0.98095238 0.97142857 0.97619048 0.97380952 0.97380952 0.98571429 0.97142857 0.97862233 0.97862233 0.98099762] mean value: 0.9771575613618368 key: test_fscore value: [0.88235294 0.84057971 0.86956522 0.87878788 0.90909091 0.82352941 0.80597015 0.86153846 0.7761194 0.85714286] mean value: 0.8504676939276321 key: train_fscore value: [0.9858156 0.97887324 0.98233216 0.98059965 0.98059965 0.98932384 0.97887324 0.9840708 0.9840708 0.98586572] mean value: 0.9830424692438128 key: test_precision value: [0.81081081 0.76315789 0.78947368 0.82857143 0.85714286 0.75675676 0.75 0.82352941 0.72222222 0.81818182] mean value: 0.7919846884397969 key: train_precision value: [0.97202797 0.95862069 0.96527778 0.96193772 0.96193772 0.97887324 0.95862069 0.96864111 0.96864111 0.97212544] mean value: 0.9666703466583892 key: test_recall value: [0.96774194 0.93548387 0.96774194 0.93548387 0.96774194 0.90322581 0.87096774 0.90322581 0.83870968 0.9 ] mean value: 0.9190322580645162 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.76512097 0.68649194 0.73387097 0.78024194 0.82762097 0.6703629 0.65423387 0.7516129 0.58602151 0.7625 ] mean value: 0.7218077956989247 key: train_roc_auc value: [0.97183099 0.95774648 0.96478873 0.96126761 0.96126761 0.97887324 0.95774648 0.96853147 0.96853147 0.97183099] mean value: 0.9662415049738993 key: test_jcc value: [0.78947368 0.725 0.76923077 0.78378378 0.83333333 0.7 0.675 0.75675676 0.63414634 0.75 ] mean value: 0.7416724668778584 key: train_jcc value: [0.97202797 0.95862069 0.96527778 0.96193772 0.96193772 0.97887324 0.95862069 0.96864111 0.96864111 0.97212544] mean value: 0.9666703466583892 MCC on Blind test: 0.15 Accuracy on Blind test: 0.35 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.63199282 0.64057088 0.6330502 0.62381983 0.62570333 0.63321471 0.6462667 0.64404368 0.64453101 0.61528659] mean value: 0.633847975730896 key: score_time value: [0.00964928 0.01027107 0.00939989 0.01029348 0.01047158 0.00946736 0.01022267 0.010427 0.00941825 0.00946999] mean value: 0.0099090576171875 key: test_mcc value: [0.95299692 0.8566725 1. 0.95436677 0.90662544 0.81503725 1. 0.95250095 0.95087679 0.7125 ] mean value: 0.9101576622344475 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9787234 0.93617021 1. 0.9787234 0.95744681 0.91489362 1. 0.97826087 0.97826087 0.86956522] mean value: 0.959204440333025 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.95238095 1. 0.98360656 0.96875 0.93939394 1. 0.98360656 0.98412698 0.9 ] mean value: 0.9695991974782958 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.9375 1. 1. 0.93939394 0.88571429 1. 1. 0.96875 0.9 ] mean value: 0.9600108225108225 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 1. 1. 1. 0.96774194 1. 0.9 ] mean value: 0.9803225806451613 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96875 0.92137097 1. 0.98387097 0.9375 0.875 1. 0.98387097 0.96666667 0.85625 ] mean value: 0.9493279569892473 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.90909091 1. 0.96774194 0.93939394 0.88571429 1. 0.96774194 0.96875 0.81818182] mean value: 0.9425364823348694 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03014874 0.03859639 0.02921605 0.02919149 0.02927637 0.02877808 0.02914238 0.0289011 0.02861476 0.03231573] mean value: 0.03041810989379883 key: score_time value: [0.01298356 0.02056313 0.01420951 0.01520872 0.01517725 0.015167 0.01596665 0.01756454 0.0154233 0.01316094] mean value: 0.01554245948791504 key: test_mcc value: [-0.00390816 -0.18759162 -0.10225003 0.33463647 0.42453805 -0.18759162 -0.00572561 -0.03648678 -0.05008953 0.0382546 ] mean value: 0.022378577955517495 key: train_mcc value: [0.2960748 0.28737578 0.2960748 0.28737578 0.26927519 0.32099733 0.32896374 0.30312249 0.29467148 0.28753566] mean value: 0.2971467060046902 key: test_accuracy value: [0.63829787 0.59574468 0.59574468 0.72340426 0.74468085 0.59574468 0.61702128 0.60869565 0.63043478 0.63043478] mean value: 0.6380203515263645 key: train_accuracy value: [0.7047619 0.70238095 0.7047619 0.70238095 0.69761905 0.71190476 0.71428571 0.70546318 0.70308789 0.70308789] mean value: 0.7049734192964596 key: test_fscore value: [0.77333333 0.74666667 0.73972603 0.82191781 0.83783784 0.74666667 0.75 0.74285714 0.76712329 0.76056338] mean value: 0.7686692150931008 key: train_fscore value: [0.81764706 0.8164464 0.81764706 0.8164464 0.81405564 0.82127031 0.82248521 0.81764706 0.8164464 0.81698389] mean value: 0.8177075432290432 key: test_precision value: [0.65909091 0.63636364 0.64285714 0.71428571 0.72093023 0.63636364 0.65853659 0.66666667 0.66666667 0.65853659] mean value: 0.6660297775584219 key: train_precision value: [0.69154229 0.6898263 0.69154229 0.6898263 0.68641975 0.69674185 0.69849246 0.69154229 0.6898263 0.69059406] mean value: 0.6916353903300737 key: test_recall value: [0.93548387 0.90322581 0.87096774 0.96774194 1. 0.90322581 0.87096774 0.83870968 0.90322581 0.9 ] mean value: 0.9093548387096774 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.49899194 0.4516129 0.46673387 0.60887097 0.625 0.4516129 0.49798387 0.48602151 0.48494624 0.5125 ] mean value: 0.5084274193548387 key: train_roc_auc value: [0.56338028 0.55985915 0.56338028 0.55985915 0.5528169 0.57394366 0.57746479 0.56643357 0.56293706 0.55985915] mean value: 0.563993400965232 key: test_jcc value: [0.63043478 0.59574468 0.58695652 0.69767442 0.72093023 0.59574468 0.6 0.59090909 0.62222222 0.61363636] mean value: 0.6254252993980421 key: train_jcc value: [0.69154229 0.6898263 0.69154229 0.6898263 0.68641975 0.69674185 0.69849246 0.69154229 0.6898263 0.69059406] mean value: 0.6916353903300737 MCC on Blind test: 0.04 Accuracy on Blind test: 0.15 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02983284 0.03181815 0.03235269 0.03182912 0.03201151 0.031914 0.03235412 0.03964448 0.04482198 0.0336349 ] mean value: 0.03402137756347656 key: score_time value: [0.02541828 0.02288342 0.02519679 0.02281809 0.02517581 0.02304912 0.02497959 0.025208 0.02368593 0.02489567] mean value: 0.024331068992614745 key: test_mcc value: [0.90662544 0.8084425 0.90662544 0.95299692 0.90662544 0.6139232 0.81048387 0.80215054 0.59332241 0.7073172 ] mean value: 0.8008512976244041 key: train_mcc value: [0.93614376 0.93614376 0.91503448 0.93085643 0.93085643 0.93085643 0.92030205 0.93149626 0.92041993 0.94174218] mean value: 0.9293851718860523 key: test_accuracy value: [0.95744681 0.91489362 0.95744681 0.9787234 0.95744681 0.82978723 0.91489362 0.91304348 0.82608696 0.86956522] mean value: 0.9119333950046253 key: train_accuracy value: [0.97142857 0.97142857 0.96190476 0.96904762 0.96904762 0.96904762 0.96428571 0.96912114 0.96437055 0.97387173] mean value: 0.968355389661803 key: test_fscore value: [0.96875 0.9375 0.96875 0.98412698 0.96875 0.88235294 0.93548387 0.93548387 0.875 0.90322581] mean value: 0.9359423473690551 key: train_fscore value: [0.9787234 0.9787234 0.97183099 0.97699115 0.97699115 0.97699115 0.97354497 0.97707231 0.97345133 0.98059965] mean value: 0.9764919504404125 key: test_precision value: [0.93939394 0.90909091 0.93939394 0.96875 0.93939394 0.81081081 0.93548387 0.93548387 0.84848485 0.875 ] mean value: 0.910128612850387 key: train_precision value: [0.96503497 0.96503497 0.95172414 0.96167247 0.96167247 0.96167247 0.9550173 0.95847751 0.95818815 0.96527778] mean value: 0.9603772230380215 key: test_recall value: [1. 0.96774194 1. 1. 1. 0.96774194 0.93548387 0.93548387 0.90322581 0.93333333] mean value: 0.9643010752688173 key: train_recall value: [0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99640288 0.98920863 0.99641577] mean value: 0.993166756917047 key: test_roc_auc value: [0.9375 0.89012097 0.9375 0.96875 0.9375 0.76512097 0.90524194 0.90107527 0.78494624 0.84166667] mean value: 0.8869422043010753 key: train_roc_auc value: [0.96119161 0.96119161 0.9471071 0.95767048 0.95767048 0.95767048 0.95062823 0.9562434 0.95264627 0.96299662] mean value: 0.9565016292218447 key: test_jcc value: [0.93939394 0.88235294 0.93939394 0.96875 0.93939394 0.78947368 0.87878788 0.87878788 0.77777778 0.82352941] mean value: 0.8817641390687057 key: train_jcc value: [0.95833333 0.95833333 0.94520548 0.9550173 0.9550173 0.9550173 0.94845361 0.95517241 0.94827586 0.96193772] mean value: 0.9540763649605376 MCC on Blind test: 0.2 Accuracy on Blind test: 0.43 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.27491093 0.27264166 0.27324533 0.27130818 0.28760481 0.28644633 0.33207846 0.28306985 0.2727232 0.27460504] mean value: 0.2828633785247803 key: score_time value: [0.02479482 0.02375317 0.02303386 0.02578378 0.0250814 0.02236199 0.02544141 0.02499795 0.02385545 0.02501225] mean value: 0.024411606788635253 key: test_mcc value: [0.86091836 0.71025956 0.95299692 0.95436677 0.90662544 0.81503725 0.8566725 0.81245565 0.74844698 0.75776742] mean value: 0.8375546868374703 key: train_mcc value: [0.95204958 0.95204958 0.94674008 0.94131391 0.94131391 0.95204958 0.94674008 0.95769694 0.9469923 0.96812026] mean value: 0.9505066231143429 key: test_accuracy value: [0.93617021 0.87234043 0.9787234 0.9787234 0.95744681 0.91489362 0.93617021 0.91304348 0.89130435 0.89130435] mean value: 0.9270120259019426 key: train_accuracy value: [0.97857143 0.97857143 0.97619048 0.97380952 0.97380952 0.97857143 0.97619048 0.98099762 0.97624703 0.98574822] mean value: 0.977870715982355 key: test_fscore value: [0.95081967 0.90625 0.98412698 0.98360656 0.96875 0.93939394 0.95238095 0.93333333 0.92307692 0.91803279] mean value: 0.9459771148705575 key: train_fscore value: /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:114: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:117: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [0.98395722 0.98395722 0.98220641 0.98039216 0.98039216 0.98395722 0.98220641 0.98576512 0.98220641 0.98932384] mean value: 0.9834364156532882 key: test_precision value: [0.96666667 0.87878788 0.96875 1. 0.93939394 0.88571429 0.9375 0.96551724 0.88235294 0.90322581] mean value: 0.9327908759570165 key: train_precision value: [0.97526502 0.97526502 0.97183099 0.97173145 0.97173145 0.97526502 0.97183099 0.97535211 0.97183099 0.98233216] mean value: 0.9742435176429602 key: test_recall value: [0.93548387 0.93548387 1. 0.96774194 1. 1. 0.96774194 0.90322581 0.96774194 0.93333333] mean value: 0.9610752688172043 key: train_recall value: [0.99280576 0.99280576 0.99280576 0.98920863 0.98920863 0.99280576 0.99280576 0.99640288 0.99280576 0.99641577] mean value: 0.9928070446868312 key: test_roc_auc value: [0.93649194 0.84274194 0.96875 0.98387097 0.9375 0.875 0.92137097 0.91827957 0.85053763 0.87291667] mean value: 0.9107459677419355 key: train_roc_auc value: [0.97175499 0.97175499 0.96823386 0.9664353 0.9664353 0.97175499 0.96823386 0.97372591 0.96843085 0.98060225] mean value: 0.9707362318873928 key: test_jcc value: [0.90625 0.82857143 0.96875 0.96774194 0.93939394 0.88571429 0.90909091 0.875 0.85714286 0.84848485] mean value: 0.8986140203882139 key: train_jcc value: [0.96842105 0.96842105 0.96503497 0.96153846 0.96153846 0.96842105 0.96503497 0.97192982 0.96503497 0.97887324] mean value: 0.9674248040074578 MCC on Blind test: 0.16 Accuracy on Blind test: 0.41 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03594327 0.03823495 0.03745127 0.03787708 0.03765678 0.03796601 0.03775644 0.04696417 0.07437563 0.07606626] mean value: 0.046029186248779295 key: score_time value: [0.01225424 0.01527619 0.01508021 0.01502061 0.01554465 0.01481938 0.01508856 0.02117872 0.01212502 0.01207471] mean value: 0.014846229553222656 key: test_mcc value: [0.96824584 0.74348441 0.80813523 0.84266484 0.87096774 0.87096774 0.77459667 0.84266484 0.70537634 0.80516731] mean value: 0.8232270968732767 key: train_mcc value: [0.88143754 0.8705036 0.87070641 0.86695696 0.85646981 0.85265591 0.87070641 0.86366703 0.86042111 0.86714973] mean value: 0.8660674512400568 key: test_accuracy value: [0.98387097 0.87096774 0.90322581 0.91935484 0.93548387 0.93548387 0.88709677 0.91935484 0.85245902 0.90163934] mean value: 0.9108937070333156 key: train_accuracy value: [0.94064748 0.9352518 0.9352518 0.93345324 0.92805755 0.92625899 0.9352518 0.93165468 0.92998205 0.93357271] mean value: 0.9329382095759657 key: test_fscore value: [0.98412698 0.875 0.90625 0.92307692 0.93548387 0.93548387 0.8852459 0.91525424 0.85245902 0.90322581] mean value: 0.9115606610911926 key: train_fscore value: [0.94117647 0.9352518 0.93594306 0.93381038 0.92907801 0.92691622 0.93594306 0.93262411 0.93097345 0.93381038] mean value: 0.9335526941508385 key: test_precision value: [0.96875 0.84848485 0.87878788 0.88235294 0.93548387 0.93548387 0.9 0.96428571 0.86666667 0.875 ] mean value: 0.9055295791337062 key: train_precision value: [0.93286219 0.9352518 0.92605634 0.92882562 0.91608392 0.91872792 0.92605634 0.91958042 0.91637631 0.93214286] mean value: 0.9251963702827759 key: test_recall value: [1. 0.90322581 0.93548387 0.96774194 0.93548387 0.93548387 0.87096774 0.87096774 0.83870968 0.93333333] mean value: 0.9191397849462366 key: train_recall value: [0.94964029 0.9352518 0.94604317 0.93884892 0.94244604 0.9352518 0.94604317 0.94604317 0.94604317 0.93548387] mean value: 0.9421095381759109 key: test_roc_auc value: [0.98387097 0.87096774 0.90322581 0.91935484 0.93548387 0.93548387 0.88709677 0.91935484 0.85268817 0.90215054] mean value: 0.9109677419354839 key: train_roc_auc value: [0.94064748 0.9352518 0.9352518 0.93345324 0.92805755 0.92625899 0.9352518 0.93165468 0.93001083 0.93356927] mean value: 0.9329407441788504 key: test_jcc value: [0.96875 0.77777778 0.82857143 0.85714286 0.87878788 0.87878788 0.79411765 0.84375 0.74285714 0.82352941] mean value: 0.8394072022748493 key: train_jcc value: [0.88888889 0.87837838 0.87959866 0.87583893 0.86754967 0.86378738 0.87959866 0.87375415 0.87086093 0.87583893] mean value: 0.8754094568296669 MCC on Blind test: 0.2 Accuracy on Blind test: 0.51 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.91726518 0.8967886 1.0972681 0.90807962 1.03316021 0.91765285 1.06892157 0.88056183 0.8810854 0.9651711 ] mean value: 0.9565954446792603 key: score_time value: [0.01491475 0.01541138 0.0154171 0.0152595 0.01555252 0.01543021 0.01531744 0.01531911 0.02811074 0.01530552] mean value: 0.016603827476501465 key: test_mcc value: [0.87096774 0.87278605 0.96824584 0.93743687 0.90369611 0.90369611 0.93743687 0.90748521 0.77096774 0.8688172 ] mean value: 0.8941535747226808 key: train_mcc value: [0.98561151 0.97841727 1. 0.98202074 1. 0.97841727 0.97124816 0.99640932 0.97845594 1. ] mean value: 0.9870580210483318 key: test_accuracy value: [0.93548387 0.93548387 0.98387097 0.96774194 0.9516129 0.9516129 0.96774194 0.9516129 0.8852459 0.93442623] mean value: 0.9464833421470121 key: train_accuracy value: [0.99280576 0.98920863 1. 0.99100719 1. 0.98920863 0.98561151 0.99820144 0.98922801 1. ] mean value: 0.9935271172648954 key: test_fscore value: [0.93548387 0.9375 0.98360656 0.96875 0.95081967 0.95238095 0.96666667 0.94915254 0.8852459 0.93333333] mean value: 0.9462939496869116 key: train_fscore value: [0.99280576 0.98920863 1. 0.99099099 1. 0.98920863 0.98555957 0.9981982 0.98920863 1. ] mean value: 0.9935180410652452 key: test_precision value: [0.93548387 0.90909091 1. 0.93939394 0.96666667 0.9375 1. 1. 0.9 0.93333333] mean value: 0.952146871945259 key: train_precision value: [0.99280576 0.98920863 1. 0.99277978 1. 0.98920863 0.98913043 1. 0.98920863 1. ] mean value: 0.9942341872852369 key: test_recall value: [0.93548387 0.96774194 0.96774194 1. 0.93548387 0.96774194 0.93548387 0.90322581 0.87096774 0.93333333] mean value: 0.9417204301075268 key: train_recall value: [0.99280576 0.98920863 1. 0.98920863 1. 0.98920863 0.98201439 0.99640288 0.98920863 1. ] mean value: 0.9928057553956835 key: test_roc_auc value: [0.93548387 0.93548387 0.98387097 0.96774194 0.9516129 0.9516129 0.96774194 0.9516129 0.88548387 0.9344086 ] mean value: 0.946505376344086 key: train_roc_auc value: [0.99280576 0.98920863 1. 0.99100719 1. 0.98920863 0.98561151 0.99820144 0.98922797 1. ] mean value: 0.9935271137928368 key: test_jcc value: [0.87878788 0.88235294 0.96774194 0.93939394 0.90625 0.90909091 0.93548387 0.90322581 0.79411765 0.875 ] mean value: 0.8991444928411247 key: train_jcc value: [0.98571429 0.97864769 1. 0.98214286 1. 0.97864769 0.97153025 0.99640288 0.97864769 1. ] mean value: 0.9871733330163526 MCC on Blind test: 0.13 Accuracy on Blind test: 0.43 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01596975 0.01088405 0.01051188 0.01025772 0.01047158 0.01051545 0.01049328 0.07161856 0.01046991 0.01163387] mean value: 0.017282605171203613 key: score_time value: [0.01237273 0.00940228 0.00911212 0.00896454 0.00897431 0.00900674 0.0089376 0.00923395 0.00951004 0.00992632] mean value: 0.009544062614440917 key: test_mcc value: [0.51639778 0.54953196 0.64820372 0.59603956 0.55301004 0.75623534 0.5483871 0.64820372 0.52020635 0.64178842] mean value: 0.5978003999385197 key: train_mcc value: [0.58619138 0.63442478 0.64790132 0.63414469 0.61309946 0.62982654 0.61973231 0.60119024 0.60212461 0.64171147] mean value: 0.6210346806411129 key: test_accuracy value: [0.75806452 0.77419355 0.82258065 0.79032258 0.77419355 0.87096774 0.77419355 0.82258065 0.75409836 0.81967213] mean value: 0.7960867265996827 key: train_accuracy value: [0.78956835 0.81654676 0.82374101 0.81654676 0.8057554 0.8147482 0.80935252 0.80035971 0.80071813 0.82046679] mean value: 0.8097803624246025 key: test_fscore value: [0.75409836 0.76666667 0.83076923 0.8115942 0.78787879 0.88235294 0.77419355 0.81355932 0.7826087 0.80701754] mean value: 0.8010739299978262 key: train_fscore value: [0.80467446 0.82229965 0.82685512 0.82167832 0.8125 0.81769912 0.81468531 0.8042328 0.80492091 0.82517483] mean value: 0.8154720527371425 key: test_precision value: [0.76666667 0.79310345 0.79411765 0.73684211 0.74285714 0.81081081 0.77419355 0.85714286 0.71052632 0.85185185] mean value: 0.7838112394103743 key: train_precision value: [0.75077882 0.7972973 0.8125 0.79931973 0.7852349 0.80487805 0.79251701 0.78892734 0.78694158 0.80546075] mean value: 0.7923855463549293 key: test_recall value: [0.74193548 0.74193548 0.87096774 0.90322581 0.83870968 0.96774194 0.77419355 0.77419355 0.87096774 0.76666667] mean value: 0.8250537634408602 key: train_recall value: [0.86690647 0.84892086 0.84172662 0.84532374 0.84172662 0.83093525 0.8381295 0.82014388 0.82374101 0.84587814] mean value: 0.8403432093035249 key: test_roc_auc value: [0.75806452 0.77419355 0.82258065 0.79032258 0.77419355 0.87096774 0.77419355 0.82258065 0.75215054 0.8188172 ] mean value: 0.7958064516129032 key: train_roc_auc value: [0.78956835 0.81654676 0.82374101 0.81654676 0.8057554 0.8147482 0.80935252 0.80035971 0.80075939 0.82042108] mean value: 0.8097799180010831 key: test_jcc value: [0.60526316 0.62162162 0.71052632 0.68292683 0.65 0.78947368 0.63157895 0.68571429 0.64285714 0.67647059] mean value: 0.6696432572959795 key: train_jcc value: [0.67318436 0.69822485 0.70481928 0.69732938 0.68421053 0.69161677 0.68731563 0.67256637 0.67352941 0.70238095] mean value: 0.6885177526404157 MCC on Blind test: 0.16 Accuracy on Blind test: 0.48 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01177359 0.01193714 0.01188874 0.01178646 0.01160216 0.01176119 0.01168275 0.0110743 0.01171422 0.01068854] mean value: 0.011590909957885743 key: score_time value: [0.010288 0.00987935 0.00976849 0.00973058 0.00971985 0.00978112 0.0097537 0.00968814 0.00973296 0.00953555] mean value: 0.009787774085998536 key: test_mcc value: [0.67883359 0.64549722 0.7130241 0.77459667 0.84266484 0.67741935 0.67741935 0.74193548 0.54086022 0.67858574] mean value: 0.6970836566982328 key: train_mcc value: [0.71230395 0.73741484 0.72313855 0.73022055 0.71949894 0.73741484 0.73741484 0.72663751 0.74147134 0.72728798] mean value: 0.7292803358963176 key: test_accuracy value: [0.83870968 0.82258065 0.85483871 0.88709677 0.91935484 0.83870968 0.83870968 0.87096774 0.7704918 0.83606557] mean value: 0.8477525118984665 key: train_accuracy value: [0.85611511 0.86870504 0.86151079 0.86510791 0.85971223 0.86870504 0.86870504 0.86330935 0.87073609 0.86355476] mean value: 0.8646161347403226 key: test_fscore value: [0.83333333 0.82539683 0.86153846 0.88888889 0.91525424 0.83870968 0.83870968 0.87096774 0.77419355 0.84375 ] mean value: 0.8490742391606935 key: train_fscore value: [0.85507246 0.86894075 0.86025408 0.86535009 0.86071429 0.86894075 0.86846847 0.86281588 0.8705036 0.86231884] mean value: 0.8643379221459592 key: test_precision value: [0.86206897 0.8125 0.82352941 0.875 0.96428571 0.83870968 0.83870968 0.87096774 0.77419355 0.79411765] mean value: 0.8454082383787775 key: train_precision value: [0.86131387 0.86738351 0.86813187 0.86379928 0.85460993 0.86738351 0.8700361 0.86594203 0.8705036 0.87179487] mean value: 0.8660898573052462 key: test_recall value: [0.80645161 0.83870968 0.90322581 0.90322581 0.87096774 0.83870968 0.83870968 0.87096774 0.77419355 0.9 ] mean value: 0.854516129032258 key: train_recall value: [0.84892086 0.8705036 0.85251799 0.86690647 0.86690647 0.8705036 0.86690647 0.85971223 0.8705036 0.85304659] mean value: 0.8626427889946108 key: test_roc_auc value: [0.83870968 0.82258065 0.85483871 0.88709677 0.91935484 0.83870968 0.83870968 0.87096774 0.77043011 0.83709677] mean value: 0.8478494623655914 key: train_roc_auc value: [0.85611511 0.86870504 0.86151079 0.86510791 0.85971223 0.86870504 0.86870504 0.86330935 0.87073567 0.86357366] mean value: 0.8646179830329285 key: test_jcc value: [0.71428571 0.7027027 0.75675676 0.8 0.84375 0.72222222 0.72222222 0.77142857 0.63157895 0.72972973] mean value: 0.7394676866716341 key: train_jcc value: [0.74683544 0.76825397 0.75477707 0.76265823 0.75548589 0.76825397 0.76751592 0.75873016 0.77070064 0.75796178] mean value: 0.7611173073553839 MCC on Blind test: 0.17 Accuracy on Blind test: 0.49 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01102924 0.01101851 0.01121306 0.01111603 0.01100349 0.01029229 0.01008701 0.01012349 0.0112462 0.01112509] mean value: 0.010825443267822265 key: score_time value: [0.01412606 0.0150013 0.01282954 0.01299381 0.01446128 0.01321387 0.01302505 0.01303339 0.01337314 0.01329184] mean value: 0.013534927368164062 key: test_mcc value: [0.70116959 0.55301004 0.61807005 0.65372045 0.83914639 0.51856298 0.5809475 0.58834841 0.60818119 0.58786645] mean value: 0.624902305420752 key: train_mcc value: [0.7437841 0.75289533 0.74634949 0.74256537 0.73230817 0.72645594 0.75320817 0.74889946 0.76876369 0.73034296] mean value: 0.7445572685559485 key: test_accuracy value: [0.83870968 0.77419355 0.80645161 0.82258065 0.91935484 0.75806452 0.79032258 0.79032258 0.80327869 0.78688525] mean value: 0.8090163934426229 key: train_accuracy value: [0.8705036 0.87589928 0.87230216 0.8705036 0.86510791 0.86151079 0.87589928 0.87410072 0.88330341 0.86355476] mean value: 0.8712685506890717 key: test_fscore value: [0.81481481 0.75862069 0.79310345 0.80701754 0.91803279 0.74576271 0.79365079 0.77192982 0.8 0.75471698] mean value: 0.7957649594699424 key: train_fscore value: [0.86466165 0.87245841 0.86778399 0.866171 0.85981308 0.85444234 0.87198516 0.87132353 0.87850467 0.85714286] mean value: 0.8664286698615212 key: test_precision value: [0.95652174 0.81481481 0.85185185 0.88461538 0.93333333 0.78571429 0.78125 0.84615385 0.82758621 0.86956522] mean value: 0.8551406679901807 key: train_precision value: [0.90551181 0.8973384 0.8996139 0.89615385 0.89494163 0.90039841 0.90038314 0.89097744 0.91439689 0.90118577] mean value: 0.9000901243730935 key: test_recall value: [0.70967742 0.70967742 0.74193548 0.74193548 0.90322581 0.70967742 0.80645161 0.70967742 0.77419355 0.66666667] mean value: 0.7473118279569892 key: train_recall value: [0.82733813 0.84892086 0.8381295 0.8381295 0.82733813 0.81294964 0.84532374 0.85251799 0.84532374 0.8172043 ] mean value: 0.8353175524096852 key: test_roc_auc value: [0.83870968 0.77419355 0.80645161 0.82258065 0.91935484 0.75806452 0.79032258 0.79032258 0.80376344 0.78494624] mean value: 0.8088709677419355 key: train_roc_auc value: [0.8705036 0.87589928 0.87230216 0.8705036 0.86510791 0.86151079 0.87589928 0.87410072 0.88323535 0.86363812] mean value: 0.8712700807096259 key: test_jcc value: [0.6875 0.61111111 0.65714286 0.67647059 0.84848485 0.59459459 0.65789474 0.62857143 0.66666667 0.60606061] mean value: 0.6634497437709512 key: train_jcc value: [0.7615894 0.77377049 0.76644737 0.76393443 0.75409836 0.74587459 0.77302632 0.77198697 0.78333333 0.75 ] mean value: 0.7644061258348679 MCC on Blind test: 0.17 Accuracy on Blind test: 0.54 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.02417302 0.02532053 0.02587271 0.02416277 0.02608395 0.02859092 0.02403355 0.02894592 0.0248692 0.02357578] mean value: 0.02556283473968506 key: score_time value: [0.01271033 0.01297474 0.01253748 0.01260567 0.01325917 0.01347256 0.0124619 0.01322174 0.01214051 0.01210165] mean value: 0.01274857521057129 key: test_mcc value: [0.87278605 0.74348441 0.7190925 0.75623534 0.87096774 0.87096774 0.74193548 0.77459667 0.64178842 0.78156791] mean value: 0.7773422266927121 key: train_mcc value: [0.83715789 0.83214747 0.84053106 0.81773799 0.82321735 0.82227458 0.82610134 0.81986865 0.82643287 0.82557745] mean value: 0.8271046647580531 key: test_accuracy value: [0.93548387 0.87096774 0.85483871 0.87096774 0.93548387 0.93548387 0.87096774 0.88709677 0.81967213 0.8852459 ] mean value: 0.8866208355367531 key: train_accuracy value: [0.91726619 0.91546763 0.91906475 0.90827338 0.91007194 0.91007194 0.9118705 0.90827338 0.91202873 0.91202873] mean value: 0.9124417162858582 key: test_fscore value: [0.9375 0.875 0.86567164 0.88235294 0.93548387 0.93548387 0.87096774 0.88888889 0.83076923 0.89230769] mean value: 0.8914425878804295 key: train_fscore value: [0.92041522 0.91768827 0.9220104 0.91068301 0.9137931 0.91319444 0.91507799 0.91222031 0.91507799 0.91478261] mean value: 0.9154943347587674 key: test_precision value: [0.90909091 0.84848485 0.80555556 0.81081081 0.93548387 0.93548387 0.87096774 0.875 0.79411765 0.82857143] mean value: 0.8613566683443343 key: train_precision value: [0.88666667 0.89419795 0.88963211 0.88737201 0.87748344 0.88255034 0.88294314 0.87458746 0.88294314 0.88851351] mean value: 0.8846889778724271 key: test_recall value: [0.96774194 0.90322581 0.93548387 0.96774194 0.93548387 0.93548387 0.87096774 0.90322581 0.87096774 0.96666667] mean value: 0.9256989247311828 key: train_recall value: [0.95683453 0.94244604 0.95683453 0.9352518 0.95323741 0.94604317 0.94964029 0.95323741 0.94964029 0.94265233] mean value: 0.9485817797375004 key: test_roc_auc value: [0.93548387 0.87096774 0.85483871 0.87096774 0.93548387 0.93548387 0.87096774 0.88709677 0.8188172 0.88655914] mean value: 0.8866666666666667 key: train_roc_auc value: [0.91726619 0.91546763 0.91906475 0.90827338 0.91007194 0.91007194 0.9118705 0.90827338 0.91209613 0.91197365] mean value: 0.9124429488667131 key: test_jcc value: [0.88235294 0.77777778 0.76315789 0.78947368 0.87878788 0.87878788 0.77142857 0.8 0.71052632 0.80555556] mean value: 0.8057848498250975 key: train_jcc value: [0.8525641 0.84789644 0.85530547 0.83601286 0.84126984 0.84025559 0.84345048 0.83860759 0.84345048 0.84294872] mean value: 0.8441761574343863 MCC on Blind test: 0.23 Accuracy on Blind test: 0.47 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.69970942 2.01441073 2.02653909 2.06026721 2.04769754 2.04136062 1.92918682 1.99944806 1.96888709 2.03387642] mean value: 1.9821382999420165 key: score_time value: [0.01246619 0.02080941 0.01246428 0.0147903 0.0124898 0.02321792 0.01491332 0.02058625 0.01244164 0.01501036] mean value: 0.015918946266174315 key: test_mcc value: [0.87278605 0.84266484 0.80813523 0.87278605 0.90369611 0.87096774 0.80645161 0.90748521 0.67204301 0.8688172 ] mean value: 0.8425833063747562 key: train_mcc value: [0.99280576 0.98921503 0.99280576 0.98921503 0.98921503 0.99640932 0.99280576 0.99640932 0.98923428 1. ] mean value: 0.9928115293600994 key: test_accuracy value: [0.93548387 0.91935484 0.90322581 0.93548387 0.9516129 0.93548387 0.90322581 0.9516129 0.83606557 0.93442623] mean value: 0.920597567424643 key: train_accuracy value: [0.99640288 0.99460432 0.99640288 0.99460432 0.99460432 0.99820144 0.99640288 0.99820144 0.994614 1. ] mean value: 0.9964038464022319 key: test_fscore value: [0.9375 0.92307692 0.90625 0.9375 0.95238095 0.93548387 0.90322581 0.94915254 0.83870968 0.93333333] mean value: 0.9216613106002799 key: train_fscore value: [0.99640288 0.99459459 0.99640288 0.99459459 0.99459459 0.99820467 0.99640288 0.99820467 0.99459459 1. ] mean value: 0.9963996347199013 key: test_precision value: [0.90909091 0.88235294 0.87878788 0.90909091 0.9375 0.93548387 0.90322581 1. 0.83870968 0.93333333] mean value: 0.912757532631821 key: train_precision value: [0.99640288 0.99638989 0.99640288 0.99638989 0.99638989 0.99641577 0.99640288 0.99641577 0.99638989 1. ] mean value: 0.9967599741099167 key: test_recall value: [0.96774194 0.96774194 0.93548387 0.96774194 0.96774194 0.93548387 0.90322581 0.90322581 0.83870968 0.93333333] mean value: 0.9320430107526881 key: train_recall value: [0.99640288 0.99280576 0.99640288 0.99280576 0.99280576 1. 0.99640288 1. 0.99280576 1. ] mean value: 0.996043165467626 key: test_roc_auc value: [0.93548387 0.91935484 0.90322581 0.93548387 0.9516129 0.93548387 0.90322581 0.9516129 0.83602151 0.9344086 ] mean value: 0.9205913978494624 key: train_roc_auc value: [0.99640288 0.99460432 0.99640288 0.99460432 0.99460432 0.99820144 0.99640288 0.99820144 0.99461076 1. ] mean value: 0.9964035223434156 key: test_jcc value: [0.88235294 0.85714286 0.82857143 0.88235294 0.90909091 0.87878788 0.82352941 0.90322581 0.72222222 0.875 ] mean value: 0.8562276396384556 key: train_jcc value: [0.99283154 0.98924731 0.99283154 0.98924731 0.98924731 0.99641577 0.99283154 0.99641577 0.98924731 1. ] mean value: 0.992831541218638 MCC on Blind test: 0.18 Accuracy on Blind test: 0.53 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.03310966 0.02370453 0.02852869 0.024019 0.02490878 0.02375698 0.0259831 0.02331567 0.02899504 0.02387166] mean value: 0.02601931095123291 key: score_time value: [0.01206684 0.00918984 0.00911832 0.00887275 0.00887132 0.00887656 0.00896263 0.00905371 0.00915074 0.00887108] mean value: 0.00930337905883789 key: test_mcc value: [0.93548387 0.93548387 1. 0.87278605 0.96824584 0.96824584 0.87096774 0.90748521 0.93649139 0.83655914] mean value: 0.9231748950840374 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.96774194 0.96774194 1. 0.93548387 0.98387097 0.98387097 0.93548387 0.9516129 0.96721311 0.91803279] mean value: 0.9611052353252247 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.96774194 0.96774194 1. 0.9375 0.98360656 0.98360656 0.93548387 0.94915254 0.96666667 0.91803279] mean value: 0.9609532852614375 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96774194 0.96774194 1. 0.90909091 1. 1. 0.93548387 1. 1. 0.90322581] mean value: 0.9683284457478005 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.96774194 1. 0.96774194 0.96774194 0.96774194 0.93548387 0.90322581 0.93548387 0.93333333] mean value: 0.9546236559139785 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96774194 0.96774194 1. 0.93548387 0.98387097 0.98387097 0.93548387 0.9516129 0.96774194 0.91827957] mean value: 0.9611827956989247 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9375 0.9375 1. 0.88235294 0.96774194 0.96774194 0.87878788 0.90322581 0.93548387 0.84848485] mean value: 0.9258819216836295 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.12 Accuracy on Blind test: 0.68 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.130229 0.126683 0.12822556 0.12735248 0.12751722 0.12969017 0.12745762 0.12832069 0.12567377 0.12686229] mean value: 0.12780117988586426 key: score_time value: [0.01785779 0.01803756 0.017869 0.01788187 0.01802921 0.01807475 0.01809239 0.01788807 0.01800728 0.0179708 ] mean value: 0.017970871925354005 key: test_mcc value: [0.93548387 0.80813523 0.80813523 0.83914639 0.93548387 0.83914639 0.74348441 0.93743687 0.77072165 0.90215054] mean value: 0.8519324452598774 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.96774194 0.90322581 0.90322581 0.91935484 0.96774194 0.91935484 0.87096774 0.96774194 0.8852459 0.95081967] mean value: 0.9255420412480169 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.96774194 0.90625 0.90625 0.92063492 0.96774194 0.91803279 0.875 0.96666667 0.88888889 0.95081967] mean value: 0.9268026806174612 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96774194 0.87878788 0.87878788 0.90625 0.96774194 0.93333333 0.84848485 1. 0.875 0.93548387] mean value: 0.9191611681329424 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 0.93548387 0.93548387 0.96774194 0.90322581 0.90322581 0.93548387 0.90322581 0.96666667] mean value: 0.9353763440860214 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.96774194 0.90322581 0.90322581 0.91935484 0.96774194 0.91935484 0.87096774 0.96774194 0.88494624 0.95107527] mean value: 0.9255376344086022 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9375 0.82857143 0.82857143 0.85294118 0.9375 0.84848485 0.77777778 0.93548387 0.8 0.90625 ] mean value: 0.8653080530843814 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.15 Accuracy on Blind test: 0.41 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01067162 0.01046467 0.01075888 0.01043582 0.01073885 0.01057148 0.0104928 0.01046562 0.01066518 0.0106113 ] mean value: 0.010587620735168456 key: score_time value: [0.00888348 0.00908494 0.00885534 0.00883126 0.00887513 0.00885701 0.00883675 0.00886393 0.0088141 0.00885773] mean value: 0.00887596607208252 key: test_mcc value: [0.74348441 0.64820372 0.45374261 0.74193548 0.58834841 0.54953196 0.61418277 0.61807005 0.57419355 0.64708149] mean value: 0.6178774447532813 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.87096774 0.82258065 0.72580645 0.87096774 0.79032258 0.77419355 0.80645161 0.80645161 0.78688525 0.81967213] mean value: 0.8074299312533051 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.875 0.83076923 0.71186441 0.87096774 0.80597015 0.76666667 0.8125 0.79310345 0.78688525 0.8 ] mean value: 0.8053726889582276 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.84848485 0.79411765 0.75 0.87096774 0.75 0.79310345 0.78787879 0.85185185 0.8 0.88 ] mean value: 0.8126404325485658 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.90322581 0.87096774 0.67741935 0.87096774 0.87096774 0.74193548 0.83870968 0.74193548 0.77419355 0.73333333] mean value: 0.8023655913978495 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.87096774 0.82258065 0.72580645 0.87096774 0.79032258 0.77419355 0.80645161 0.80645161 0.78709677 0.81827957] mean value: 0.8073118279569893 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.77777778 0.71052632 0.55263158 0.77142857 0.675 0.62162162 0.68421053 0.65714286 0.64864865 0.66666667] mean value: 0.6765654564338774 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.15 Accuracy on Blind test: 0.48 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [2.01657772 2.0090251 2.03078365 2.00115323 2.0330627 2.06364393 2.02089286 2.09506607 2.07742858 2.05084038] mean value: 2.039847421646118 key: score_time value: [0.09266973 0.09811974 0.09791493 0.0990932 0.09410357 0.09961629 0.09475183 0.10066032 0.10026288 0.10025406] mean value: 0.09774465560913086 key: test_mcc value: [0.96824584 0.87278605 0.93743687 0.84266484 0.96824584 1. 0.83914639 0.96824584 0.93635873 0.83655914] mean value: 0.9169689529423093 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.93548387 0.96774194 0.91935484 0.98387097 1. 0.91935484 0.98387097 0.96721311 0.91803279] mean value: 0.9578794288736119 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.9375 0.96875 0.92307692 0.98412698 1. 0.92063492 0.98360656 0.96875 0.91803279] mean value: 0.9588605156228107 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.90909091 0.93939394 0.88235294 0.96875 1. 0.90625 1. 0.93939394 0.90322581] mean value: 0.9417207535506872 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 1. 1. 0.93548387 0.96774194 1. 0.93333333] mean value: 0.9772043010752688 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.93548387 0.96774194 0.91935484 0.98387097 1. 0.91935484 0.98387097 0.96666667 0.91827957] mean value: 0.9578494623655914 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.88235294 0.93939394 0.85714286 0.96875 1. 0.85294118 0.96774194 0.93939394 0.84848485] mean value: 0.9224951637546515 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.13 Accuracy on Blind test: 0.37 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000...05', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.00179338 1.07502389 1.05377007 0.99681354 1.01473498 1.02150321 0.98504329 1.10019326 1.03762698 1.07313776] mean value: 1.035964035987854 key: score_time value: [0.27896857 0.23714018 0.17314553 0.26825953 0.26656032 0.25020838 0.22471714 0.17120504 0.24397707 0.22602963] mean value: 0.23402113914489747 key: test_mcc value: [1. 0.83914639 0.93743687 0.87831007 0.93743687 0.96824584 0.87096774 0.96824584 0.93635873 0.83655914] mean value: 0.9172707478226112 key: train_mcc value: [0.96778244 0.96778244 0.96778244 0.97487691 0.97132357 0.97487691 0.97132357 0.97851856 0.96080787 0.97502162] mean value: 0.9710096336881678 key: test_accuracy value: [1. 0.91935484 0.96774194 0.93548387 0.96774194 0.98387097 0.93548387 0.98387097 0.96721311 0.91803279] mean value: 0.9578794288736119 key: train_accuracy value: [0.98381295 0.98381295 0.98381295 0.98741007 0.98561151 0.98741007 0.98561151 0.98920863 0.98025135 0.98743268] mean value: 0.9854374669026 key: test_fscore value: [1. 0.92063492 0.96875 0.93939394 0.96875 0.98412698 0.93548387 0.98360656 0.96875 0.91803279] mean value: 0.9587529059385881 key: train_fscore value: [0.98395722 0.98395722 0.98395722 0.98747764 0.98571429 0.98747764 0.98571429 0.98928571 0.98046181 0.98756661] mean value: 0.9855569639932104 key: test_precision value: [1. 0.90625 0.93939394 0.88571429 0.93939394 0.96875 0.93548387 1. 0.93939394 0.90322581] mean value: 0.9417605781315459 key: train_precision value: [0.97526502 0.97526502 0.97526502 0.98220641 0.9787234 0.98220641 0.9787234 0.9822695 0.96842105 0.97887324] mean value: 0.977721846851637 key: test_recall value: [1. 0.93548387 1. 1. 1. 1. 0.93548387 0.96774194 1. 0.93333333] mean value: 0.9772043010752688 key: train_recall value: [0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99640288 0.99280576 0.99641577] mean value: 0.9935264691472628 key: test_roc_auc value: [1. 0.91935484 0.96774194 0.93548387 0.96774194 0.98387097 0.93548387 0.98387097 0.96666667 0.91827957] mean value: 0.9578494623655914 key: train_roc_auc value: [0.98381295 0.98381295 0.98381295 0.98741007 0.98561151 0.98741007 0.98561151 0.98920863 0.98027385 0.98741652] mean value: 0.9854381011319977 key: test_jcc value: [1. 0.85294118 0.93939394 0.88571429 0.93939394 0.96875 0.87878788 0.96774194 0.93939394 0.84848485] mean value: 0.922060194312329 key: train_jcc value: [0.96842105 0.96842105 0.96842105 0.97526502 0.97183099 0.97526502 0.97183099 0.97879859 0.96167247 0.9754386 ] mean value: 0.9715364821992674 MCC on Blind test: 0.17 Accuracy on Blind test: 0.41 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02595091 0.01143193 0.01075721 0.0116322 0.01162744 0.01118588 0.01176763 0.01181078 0.01117778 0.01176858] mean value: 0.012911033630371094 key: score_time value: [0.01068473 0.00950789 0.00947976 0.00941181 0.00950432 0.00955606 0.00921035 0.00912285 0.00968337 0.0091567 ] mean value: 0.009531784057617187 key: test_mcc value: [0.67883359 0.64549722 0.7130241 0.77459667 0.84266484 0.67741935 0.67741935 0.74193548 0.54086022 0.67858574] mean value: 0.6970836566982328 key: train_mcc value: [0.71230395 0.73741484 0.72313855 0.73022055 0.71949894 0.73741484 0.73741484 0.72663751 0.74147134 0.72728798] mean value: 0.7292803358963176 key: test_accuracy value: [0.83870968 0.82258065 0.85483871 0.88709677 0.91935484 0.83870968 0.83870968 0.87096774 0.7704918 0.83606557] mean value: 0.8477525118984665 key: train_accuracy value: [0.85611511 0.86870504 0.86151079 0.86510791 0.85971223 0.86870504 0.86870504 0.86330935 0.87073609 0.86355476] mean value: 0.8646161347403226 key: test_fscore value: [0.83333333 0.82539683 0.86153846 0.88888889 0.91525424 0.83870968 0.83870968 0.87096774 0.77419355 0.84375 ] mean value: 0.8490742391606935 key: train_fscore value: [0.85507246 0.86894075 0.86025408 0.86535009 0.86071429 0.86894075 0.86846847 0.86281588 0.8705036 0.86231884] mean value: 0.8643379221459592 key: test_precision value: [0.86206897 0.8125 0.82352941 0.875 0.96428571 0.83870968 0.83870968 0.87096774 0.77419355 0.79411765] mean value: 0.8454082383787775 key: train_precision value: [0.86131387 0.86738351 0.86813187 0.86379928 0.85460993 0.86738351 0.8700361 0.86594203 0.8705036 0.87179487] mean value: 0.8660898573052462 key: test_recall value: [0.80645161 0.83870968 0.90322581 0.90322581 0.87096774 0.83870968 0.83870968 0.87096774 0.77419355 0.9 ] mean value: 0.854516129032258 key: train_recall value: [0.84892086 0.8705036 0.85251799 0.86690647 0.86690647 0.8705036 0.86690647 0.85971223 0.8705036 0.85304659] mean value: 0.8626427889946108 key: test_roc_auc value: [0.83870968 0.82258065 0.85483871 0.88709677 0.91935484 0.83870968 0.83870968 0.87096774 0.77043011 0.83709677] mean value: 0.8478494623655914 key: train_roc_auc value: [0.85611511 0.86870504 0.86151079 0.86510791 0.85971223 0.86870504 0.86870504 0.86330935 0.87073567 0.86357366] mean value: 0.8646179830329285 key: test_jcc value: [0.71428571 0.7027027 0.75675676 0.8 0.84375 0.72222222 0.72222222 0.77142857 0.63157895 0.72972973] mean value: 0.7394676866716341 key: train_jcc value: [0.74683544 0.76825397 0.75477707 0.76265823 0.75548589 0.76825397 0.76751592 0.75873016 0.77070064 0.75796178] mean value: 0.7611173073553839 MCC on Blind test: 0.17 Accuracy on Blind test: 0.49 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.08813882 0.076159 0.07938814 0.12023497 0.13540554 0.10756207 0.07338667 0.0774343 0.07803369 0.07466531] mean value: 0.09104084968566895 key: score_time value: [0.01114702 0.01104593 0.01131225 0.01271844 0.01222229 0.01108837 0.01109123 0.01365948 0.01104116 0.01136756] mean value: 0.011669373512268067 key: test_mcc value: [0.96824584 0.93548387 0.96824584 0.87831007 0.96824584 0.93743687 1. 0.96824584 1. 0.8688172 ] mean value: 0.9493031353691006 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.96774194 0.98387097 0.93548387 0.98387097 0.96774194 1. 0.98387097 1. 0.93442623] mean value: 0.9740877842411423 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.96774194 0.98412698 0.93939394 0.98412698 0.96666667 1. 0.98360656 1. 0.93333333] mean value: 0.9743123384635811 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.96774194 0.96875 0.88571429 0.96875 1. 1. 1. 1. 0.93333333] mean value: 0.969303955453149 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 1. 1. 0.93548387 1. 0.96774194 1. 0.93333333] mean value: 0.9804301075268818 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.96774194 0.98387097 0.93548387 0.98387097 0.96774194 1. 0.98387097 1. 0.9344086 ] mean value: 0.9740860215053764 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.9375 0.96875 0.88571429 0.96875 0.93548387 1. 0.96774194 1. 0.875 ] mean value: 0.9507690092165899 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.14 Accuracy on Blind test: 0.67 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04738498 0.06282306 0.06595659 0.04460764 0.08294272 0.07518077 0.05843568 0.09132147 0.07124734 0.05581951] mean value: 0.06557197570800781 key: score_time value: [0.01911306 0.01217628 0.02016044 0.01214337 0.01833034 0.01263642 0.01912284 0.02297115 0.0121851 0.01308465] mean value: 0.016192364692687988 key: test_mcc value: [0.83914639 0.83914639 0.90748521 0.81325006 0.90369611 0.96824584 0.96824584 0.90748521 0.9344086 0.80475071] mean value: 0.8885860367733198 key: train_mcc value: [0.96425338 0.96768225 0.97132357 0.97124816 0.96412858 0.96768225 0.96412858 0.96778244 0.97137553 0.98210326] mean value: 0.9691707996843315 key: test_accuracy value: [0.91935484 0.91935484 0.9516129 0.90322581 0.9516129 0.98387097 0.98387097 0.9516129 0.96721311 0.90163934] mean value: 0.9433368588048652 key: train_accuracy value: [0.98201439 0.98381295 0.98561151 0.98561151 0.98201439 0.98381295 0.98201439 0.98381295 0.98563734 0.99102334] mean value: 0.9845365718197435 key: test_fscore value: [0.92063492 0.91803279 0.95384615 0.90909091 0.95238095 0.98360656 0.98360656 0.94915254 0.96774194 0.89655172] mean value: 0.9434645039586963 key: train_fscore value: [0.98220641 0.98389982 0.98571429 0.98566308 0.98214286 0.98389982 0.98214286 0.98395722 0.98571429 0.99108734] mean value: 0.9846427979343616 key: test_precision value: [0.90625 0.93333333 0.91176471 0.85714286 0.9375 1. 1. 1. 0.96774194 0.92857143] mean value: 0.9442304260413843 key: train_precision value: [0.97183099 0.97864769 0.9787234 0.98214286 0.9751773 0.97864769 0.9751773 0.97526502 0.9787234 0.9858156 ] mean value: 0.978015125566827 key: test_recall value: [0.93548387 0.90322581 1. 0.96774194 0.96774194 0.96774194 0.96774194 0.90322581 0.96774194 0.86666667] mean value: 0.9447311827956989 key: train_recall value: [0.99280576 0.98920863 0.99280576 0.98920863 0.98920863 0.98920863 0.98920863 0.99280576 0.99280576 0.99641577] mean value: 0.9913681957659679 key: test_roc_auc value: [0.91935484 0.91935484 0.9516129 0.90322581 0.9516129 0.98387097 0.98387097 0.9516129 0.9672043 0.90107527] mean value: 0.9432795698924732 key: train_roc_auc value: [0.98201439 0.98381295 0.98561151 0.98561151 0.98201439 0.98381295 0.98201439 0.98381295 0.98565019 0.99101364] mean value: 0.9845368866197365 key: test_jcc value: [0.85294118 0.84848485 0.91176471 0.83333333 0.90909091 0.96774194 0.96774194 0.90322581 0.9375 0.8125 ] mean value: 0.8944324650681387 key: train_jcc value: [0.96503497 0.96830986 0.97183099 0.97173145 0.96491228 0.96830986 0.96491228 0.96842105 0.97183099 0.98233216] mean value: 0.9697625873451181 MCC on Blind test: 0.16 Accuracy on Blind test: 0.42 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.0249753 0.01080441 0.01006317 0.0100286 0.01024675 0.01045632 0.01094842 0.01116395 0.01117063 0.01019931] mean value: 0.01200568675994873 key: score_time value: [0.00920653 0.00890374 0.00869942 0.00879359 0.00906634 0.00899959 0.00966144 0.00934982 0.00933647 0.0087049 ] mean value: 0.009072184562683105 key: test_mcc value: [0.74348441 0.61290323 0.67741935 0.7130241 0.74193548 0.84266484 0.54953196 0.67883359 0.50975101 0.74460444] mean value: 0.6814152411688326 key: train_mcc value: [0.66968894 0.7019886 0.70900474 0.68709037 0.69093363 0.70166132 0.70180672 0.69093363 0.70939248 0.66609934] mean value: 0.6928599755400829 key: test_accuracy value: [0.87096774 0.80645161 0.83870968 0.85483871 0.87096774 0.91935484 0.77419355 0.83870968 0.75409836 0.86885246] mean value: 0.8397144368059228 key: train_accuracy value: [0.83453237 0.85071942 0.85431655 0.84352518 0.84532374 0.85071942 0.85071942 0.84532374 0.8545781 0.83303411] mean value: 0.8462792064373635 key: test_fscore value: [0.86666667 0.80645161 0.83870968 0.86153846 0.87096774 0.92307692 0.76666667 0.84375 0.76923077 0.875 ] mean value: 0.8422058519437552 key: train_fscore value: [0.83802817 0.85361552 0.85663717 0.84436494 0.84751773 0.85257549 0.85309735 0.84751773 0.85612789 0.8342246 ] mean value: 0.8483706574660165 key: test_precision value: [0.89655172 0.80645161 0.83870968 0.82352941 0.87096774 0.88235294 0.79310345 0.81818182 0.73529412 0.82352941] mean value: 0.8288671905206617 key: train_precision value: [0.82068966 0.83737024 0.84320557 0.83985765 0.83566434 0.84210526 0.83972125 0.83566434 0.84561404 0.82978723] mean value: 0.836967958151763 key: test_recall value: [0.83870968 0.80645161 0.83870968 0.90322581 0.87096774 0.96774194 0.74193548 0.87096774 0.80645161 0.93333333] mean value: 0.8578494623655915 key: train_recall value: [0.85611511 0.8705036 0.8705036 0.84892086 0.85971223 0.86330935 0.86690647 0.85971223 0.86690647 0.83870968] mean value: 0.8601299605476909 key: test_roc_auc value: [0.87096774 0.80645161 0.83870968 0.85483871 0.87096774 0.91935484 0.77419355 0.83870968 0.75322581 0.86989247] mean value: 0.8397311827956989 key: train_roc_auc value: [0.83453237 0.85071942 0.85431655 0.84352518 0.84532374 0.85071942 0.85071942 0.84532374 0.85460019 0.8330239 ] mean value: 0.8462803950388076 key: test_jcc value: [0.76470588 0.67567568 0.72222222 0.75675676 0.77142857 0.85714286 0.62162162 0.72972973 0.625 0.77777778] mean value: 0.7302061094708153 key: train_jcc value: [0.72121212 0.74461538 0.74922601 0.73065015 0.73538462 0.74303406 0.74382716 0.73538462 0.7484472 0.71559633] mean value: 0.7367377649053004 MCC on Blind test: 0.19 Accuracy on Blind test: 0.54 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.02157235 0.02386403 0.02440476 0.0301435 0.02720165 0.02603769 0.02656388 0.02754545 0.03107619 0.03502512] mean value: 0.02734346389770508 key: score_time value: [0.01022172 0.01124573 0.01181793 0.01186442 0.01206994 0.01182413 0.01185656 0.01193142 0.0118835 0.01200509] mean value: 0.011672043800354004 key: test_mcc value: [0.96824584 0.87096774 0.90369611 0.87831007 0.90369611 0.78446454 0.87278605 0.90369611 0.77072165 0.8688172 ] mean value: 0.8725401431630221 key: train_mcc value: [0.97132357 0.93585746 0.93644001 0.96412858 0.88878772 0.87468815 0.94305636 0.91941603 0.92561092 0.97492135] mean value: 0.9334230161428768 key: test_accuracy value: [0.98387097 0.93548387 0.9516129 0.93548387 0.9516129 0.88709677 0.93548387 0.9516129 0.8852459 0.93442623] mean value: 0.935193019566367 key: train_accuracy value: [0.98561151 0.9676259 0.9676259 0.98201439 0.94244604 0.93345324 0.97122302 0.95863309 0.96229803 0.98743268] mean value: 0.9658363793704713 key: test_fscore value: [0.98360656 0.93548387 0.95238095 0.93939394 0.95081967 0.89552239 0.9375 0.95081967 0.88888889 0.93333333] mean value: 0.9367749274663901 key: train_fscore value: [0.98550725 0.96703297 0.96842105 0.98214286 0.93962264 0.9376054 0.97173145 0.96 0.96309315 0.98752228] mean value: 0.9662679037256826 key: test_precision value: [1. 0.93548387 0.9375 0.88571429 0.96666667 0.83333333 0.90909091 0.96666667 0.875 0.93333333] mean value: 0.9242789065772936 key: train_precision value: [0.99270073 0.98507463 0.94520548 0.9751773 0.98809524 0.88253968 0.95486111 0.92929293 0.94158076 0.9822695 ] mean value: 0.9576797361808079 key: test_recall value: [0.96774194 0.93548387 0.96774194 1. 0.93548387 0.96774194 0.96774194 0.93548387 0.90322581 0.93333333] mean value: 0.9513978494623656 key: train_recall value: [0.97841727 0.94964029 0.99280576 0.98920863 0.89568345 1. 0.98920863 0.99280576 0.98561151 0.99283154] mean value: 0.9766212836182667 key: test_roc_auc value: [0.98387097 0.93548387 0.9516129 0.93548387 0.9516129 0.88709677 0.93548387 0.9516129 0.88494624 0.9344086 ] mean value: 0.9351612903225808 key: train_roc_auc value: [0.98561151 0.9676259 0.9676259 0.98201439 0.94244604 0.93345324 0.97122302 0.95863309 0.96233981 0.98742296] mean value: 0.9658395863953998 key: test_jcc value: [0.96774194 0.87878788 0.90909091 0.88571429 0.90625 0.81081081 0.88235294 0.90625 0.8 0.875 ] mean value: 0.8821998761064226 key: train_jcc value: [0.97142857 0.93617021 0.93877551 0.96491228 0.886121 0.88253968 0.94501718 0.92307692 0.92881356 0.97535211] mean value: 0.9352207031286927 MCC on Blind test: 0.15 Accuracy on Blind test: 0.35 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.02023649 0.02507305 0.02207947 0.02166748 0.02319336 0.02123189 0.02448797 0.02378654 0.0200305 0.02373862] mean value: 0.02255253791809082 key: score_time value: [0.0118804 0.01188922 0.01187921 0.01177955 0.01184487 0.01185608 0.01191831 0.01188183 0.01187134 0.01185822] mean value: 0.011865901947021484 key: test_mcc value: [0.90748521 0.7284928 0.63960215 0.87831007 0.96824584 0.90369611 0.83914639 0.90369611 0.50305191 0.83638369] mean value: 0.8108110283179271 key: train_mcc value: [0.95025527 0.86017051 0.64681322 0.93958474 0.95683453 0.94305636 0.94305636 0.88357094 0.54525121 0.90616067] mean value: 0.8574753821621066 key: test_accuracy value: [0.9516129 0.85483871 0.79032258 0.93548387 0.98387097 0.9516129 0.91935484 0.9516129 0.70491803 0.91803279] mean value: 0.8961660497091486 key: train_accuracy value: [0.97482014 0.92625899 0.79496403 0.96942446 0.97841727 0.97122302 0.97122302 0.93884892 0.72890485 0.95152603] mean value: 0.9205610735827855 key: test_fscore value: [0.95384615 0.83636364 0.82666667 0.93939394 0.98412698 0.95238095 0.92063492 0.95238095 0.775 0.91525424] mean value: 0.9056048443082341 key: train_fscore value: [0.97526502 0.92100193 0.82985075 0.97001764 0.97841727 0.97173145 0.97173145 0.94217687 0.7864215 0.94953271] mean value: 0.929614657143809 key: test_precision value: [0.91176471 0.95833333 0.70454545 0.88571429 0.96875 0.9375 0.90625 0.9375 0.63265306 0.93103448] mean value: 0.8774045323458537 key: train_precision value: [0.95833333 0.99170124 0.70918367 0.95155709 0.97841727 0.95486111 0.95486111 0.89354839 0.64801865 0.9921875 ] mean value: 0.90326693685663 key: test_recall value: [1. 0.74193548 1. 1. 1. 0.96774194 0.93548387 0.96774194 1. 0.9 ] mean value: 0.9512903225806452 key: train_recall value: [0.99280576 0.85971223 1. 0.98920863 0.97841727 0.98920863 0.98920863 0.99640288 1. 0.91039427] mean value: 0.9705358294009954 key: test_roc_auc value: [0.9516129 0.85483871 0.79032258 0.93548387 0.98387097 0.9516129 0.91935484 0.9516129 0.7 0.91774194] mean value: 0.8956451612903226 key: train_roc_auc value: [0.97482014 0.92625899 0.79496403 0.96942446 0.97841727 0.97122302 0.97122302 0.93884892 0.72939068 0.95160001] mean value: 0.9206170547433021 key: test_jcc value: [0.91176471 0.71875 0.70454545 0.88571429 0.96875 0.90909091 0.85294118 0.90909091 0.63265306 0.84375 ] mean value: 0.8337050502018989 key: train_jcc value: [0.95172414 0.85357143 0.70918367 0.94178082 0.95774648 0.94501718 0.94501718 0.89067524 0.64801865 0.90391459] mean value: 0.8746649384947602 MCC on Blind test: 0.16 Accuracy on Blind test: 0.36 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.24026704 0.22439814 0.2272625 0.22671032 0.22392869 0.22420788 0.22610235 0.22615767 0.22761703 0.22552967] mean value: 0.2272181272506714 key: score_time value: [0.01524711 0.01542163 0.01545 0.01531768 0.01549482 0.01556826 0.01548982 0.01542044 0.01567101 0.01548672] mean value: 0.015456748008728028 key: test_mcc value: [0.96824584 0.93548387 0.96824584 0.87278605 0.93548387 0.93548387 0.96824584 0.93743687 0.93635873 0.8688172 ] mean value: 0.932658797454636 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.96774194 0.98387097 0.93548387 0.96774194 0.96774194 0.98387097 0.96774194 0.96721311 0.93442623] mean value: 0.9659703860391328 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98412698 0.96774194 0.98412698 0.9375 0.96774194 0.96774194 0.98360656 0.96666667 0.96875 0.93333333] mean value: 0.9661336332082631 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.96875 0.96774194 0.96875 0.90909091 0.96774194 0.96774194 1. 1. 0.93939394 0.93333333] mean value: 0.9622543988269795 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 0.96774194 0.96774194 0.96774194 0.93548387 1. 0.93333333] mean value: 0.970752688172043 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.96774194 0.98387097 0.93548387 0.96774194 0.96774194 0.98387097 0.96774194 0.96666667 0.9344086 ] mean value: 0.9659139784946237 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96875 0.9375 0.96875 0.88235294 0.9375 0.9375 0.96774194 0.93548387 0.93939394 0.875 ] mean value: 0.9349972687022023 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.05 Accuracy on Blind test: 0.31 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.07099128 0.07362032 0.09242415 0.08124018 0.08642912 0.09042525 0.08972216 0.08200741 0.09961677 0.08565569] mean value: 0.08521323204040528 key: score_time value: [0.02234936 0.02337837 0.02877522 0.02001715 0.03904366 0.02948976 0.04676676 0.03126311 0.03618932 0.03704667] mean value: 0.031431937217712404 key: test_mcc value: [1. 0.90369611 0.96824584 0.87278605 0.90369611 0.93743687 0.96824584 0.96824584 0.96770777 0.80322581] mean value: 0.9293286231770456 key: train_mcc value: [0.99640932 0.98926624 0.99283145 0.99283145 0.98921503 0.98921503 0.99640932 0.99283145 0.98923428 0.99284434] mean value: 0.9921087919345165 key: test_accuracy value: [1. 0.9516129 0.98387097 0.93548387 0.9516129 0.96774194 0.98387097 0.98387097 0.98360656 0.90163934] mean value: 0.9643310417768377 key: train_accuracy value: [0.99820144 0.99460432 0.99640288 0.99640288 0.99460432 0.99460432 0.99820144 0.99640288 0.994614 0.99640934] mean value: 0.9960447799749429 key: test_fscore value: [1. 0.95081967 0.98360656 0.9375 0.95081967 0.96666667 0.98412698 0.98360656 0.98412698 0.9 ] mean value: 0.9641273093937028 key: train_fscore value: [0.9981982 0.99457505 0.99638989 0.99638989 0.99459459 0.99459459 0.99820467 0.99638989 0.99459459 0.99640288] mean value: 0.9960334247841588 key: test_precision value: [1. 0.96666667 1. 0.90909091 0.96666667 1. 0.96875 1. 0.96875 0.9 ] mean value: 0.9679924242424243 key: train_precision value: [1. 1. 1. 1. 0.99638989 0.99638989 0.99641577 1. 0.99638989 1. ] mean value: 0.9985585445699572 key: test_recall value: [1. 0.93548387 0.96774194 0.96774194 0.93548387 0.93548387 1. 0.96774194 1. 0.9 ] mean value: 0.9609677419354838 key: train_recall value: [0.99640288 0.98920863 0.99280576 0.99280576 0.99280576 0.99280576 1. 0.99280576 0.99280576 0.99283154] mean value: 0.9935277584384106 key: test_roc_auc value: [1. 0.9516129 0.98387097 0.93548387 0.9516129 0.96774194 0.98387097 0.98387097 0.98333333 0.9016129 ] mean value: 0.9643010752688173 key: train_roc_auc value: [0.99820144 0.99460432 0.99640288 0.99640288 0.99460432 0.99460432 0.99820144 0.99640288 0.99461076 0.99641577] mean value: 0.9960450994043475 key: test_jcc value: [1. 0.90625 0.96774194 0.88235294 0.90625 0.93548387 0.96875 0.96774194 0.96875 0.81818182] mean value: 0.9321502501293772 key: train_jcc value: [0.99640288 0.98920863 0.99280576 0.99280576 0.98924731 0.98924731 0.99641577 0.99280576 0.98924731 0.99283154] mean value: 0.9921018024290246 MCC on Blind test: 0.16 Accuracy on Blind test: 0.72 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.20767641 0.30310035 0.23238945 0.24465799 0.23261762 0.22742844 0.25071764 0.28770041 0.26218009 0.24865627] mean value: 0.24971246719360352 key: score_time value: [0.02684212 0.02675676 0.03252745 0.0270896 0.0277431 0.02707982 0.02681971 0.03333616 0.02641869 0.02628922] mean value: 0.028090262413024904 key: test_mcc value: [0.90748521 0.71004695 0.74348441 0.71004695 0.83914639 0.64549722 0.67741935 0.80813523 0.63978495 0.70780713] mean value: 0.7388853795423522 key: train_mcc value: [0.97841727 0.96405373 0.97132357 0.96763216 0.96768225 0.97124816 0.97844259 0.97841727 0.97129927 0.97487139] mean value: 0.9723387641137683 key: test_accuracy value: [0.9516129 0.85483871 0.87096774 0.85483871 0.91935484 0.82258065 0.83870968 0.90322581 0.81967213 0.85245902] mean value: 0.8688260179799048 key: train_accuracy value: [0.98920863 0.98201439 0.98561151 0.98381295 0.98381295 0.98561151 0.98920863 0.98920863 0.98563734 0.98743268] mean value: 0.9861559226586415 key: test_fscore value: [0.94915254 0.85245902 0.875 0.85714286 0.91803279 0.81967213 0.83870968 0.9 0.81967213 0.84210526] mean value: 0.8671946405666758 key: train_fscore value: [0.98920863 0.98194946 0.98550725 0.98378378 0.98372514 0.98555957 0.98916968 0.98920863 0.98555957 0.98747764] mean value: 0.9861149337759959 key: test_precision value: [1. 0.86666667 0.84848485 0.84375 0.93333333 0.83333333 0.83870968 0.93103448 0.83333333 0.88888889] mean value: 0.881753456421838 key: train_precision value: [0.98920863 0.98550725 0.99270073 0.98555957 0.98909091 0.98913043 0.99275362 0.98920863 0.98913043 0.98571429] mean value: 0.9888004496836691 key: test_recall value: [0.90322581 0.83870968 0.90322581 0.87096774 0.90322581 0.80645161 0.83870968 0.87096774 0.80645161 0.8 ] mean value: 0.8541935483870968 key: train_recall value: [0.98920863 0.97841727 0.97841727 0.98201439 0.97841727 0.98201439 0.98561151 0.98920863 0.98201439 0.98924731] mean value: 0.9834571052835152 key: test_roc_auc value: [0.9516129 0.85483871 0.87096774 0.85483871 0.91935484 0.82258065 0.83870968 0.90322581 0.81989247 0.8516129 ] mean value: 0.8687634408602151 key: train_roc_auc value: [0.98920863 0.98201439 0.98561151 0.98381295 0.98381295 0.98561151 0.98920863 0.98920863 0.98563085 0.98742941] mean value: 0.9861549470101338 key: test_jcc value: [0.90322581 0.74285714 0.77777778 0.75 0.84848485 0.69444444 0.72222222 0.81818182 0.69444444 0.72727273] mean value: 0.7678911232137039 key: train_jcc value: [0.97864769 0.96453901 0.97142857 0.96808511 0.96797153 0.97153025 0.97857143 0.97864769 0.97153025 0.97526502] mean value: 0.9726216533278254 MCC on Blind test: 0.17 Accuracy on Blind test: 0.51 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.9341197 0.93513346 0.9367969 0.91957927 0.92712259 0.92074847 0.92252684 0.93268156 0.93002224 0.92744589] mean value: 0.9286176919937134 key: score_time value: [0.00948143 0.00930476 0.00921249 0.00940442 0.00933552 0.00925016 0.00932121 0.00927615 0.00973344 0.00919533] mean value: 0.009351491928100586 key: test_mcc value: [1. 0.93548387 1. 0.87278605 0.96824584 0.93743687 0.96824584 0.96824584 1. 0.8688172 ] mean value: 0.9519261499663041 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.96774194 1. 0.93548387 0.98387097 0.96774194 0.98387097 0.98387097 1. 0.93442623] mean value: 0.9757006874669487 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.96774194 1. 0.9375 0.98360656 0.96666667 0.98412698 0.98360656 1. 0.93333333] mean value: 0.9756582034364953 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.96774194 1. 0.90909091 1. 1. 0.96875 1. 1. 0.93333333] mean value: 0.9778916177908114 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 0.96774194 0.93548387 1. 0.96774194 1. 0.93333333] mean value: 0.9739784946236559 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.96774194 1. 0.93548387 0.98387097 0.96774194 0.98387097 0.98387097 1. 0.9344086 ] mean value: 0.9756989247311828 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.9375 1. 0.88235294 0.96774194 0.93548387 0.96875 0.96774194 1. 0.875 ] mean value: 0.9534570683111955 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.16 Accuracy on Blind test: 0.74 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03188586 0.03635883 0.03134942 0.03185868 0.03210068 0.03209543 0.03178 0.03248978 0.0321188 0.0346837 ] mean value: 0.032672119140625 key: score_time value: [0.01282048 0.01319289 0.0133009 0.01523924 0.01509047 0.01763248 0.01769018 0.01737404 0.01742172 0.01940989] mean value: 0.015917229652404784 key: test_mcc value: [0.55301004 0.52981294 0.61290323 0.74193548 0.7130241 0.64820372 0.61807005 0.64820372 0.61090565 0.64708149] mean value: 0.6323150429115658 key: train_mcc value: [0.88433663 0.79939871 0.96813337 0.94707924 0.88509826 0.93340152 0.98561151 0.96813337 0.97855633 0.95780462] mean value: 0.9307553567934764 key: test_accuracy value: [0.77419355 0.75806452 0.80645161 0.87096774 0.85483871 0.82258065 0.80645161 0.82258065 0.80327869 0.81967213] mean value: 0.8139079851930195 key: train_accuracy value: [0.93884892 0.89208633 0.98381295 0.97302158 0.94244604 0.96582734 0.99280576 0.98381295 0.98922801 0.97845601] mean value: 0.9640345892047583 key: test_fscore value: [0.75862069 0.7826087 0.80645161 0.87096774 0.86153846 0.83076923 0.81818182 0.81355932 0.81818182 0.8 ] mean value: 0.8160879390851283 key: train_fscore value: [0.9348659 0.90163934 0.98354662 0.97237569 0.9430605 0.96684119 0.99280576 0.98354662 0.98913043 0.97802198] mean value: 0.9645834024242367 key: test_precision value: [0.81481481 0.71052632 0.80645161 0.87096774 0.82352941 0.79411765 0.77142857 0.85714286 0.77142857 0.88 ] mean value: 0.8100407544266528 key: train_precision value: [1. 0.82831325 1. 0.99622642 0.93309859 0.93898305 0.99280576 1. 0.99635036 1. ] mean value: 0.9685777430862328 key: test_recall value: [0.70967742 0.87096774 0.80645161 0.87096774 0.90322581 0.87096774 0.87096774 0.77419355 0.87096774 0.73333333] mean value: 0.8281720430107526 key: train_recall value: [0.87769784 0.98920863 0.9676259 0.94964029 0.95323741 0.99640288 0.99280576 0.9676259 0.98201439 0.95698925] mean value: 0.9633248240117583 key: test_roc_auc value: [0.77419355 0.75806452 0.80645161 0.87096774 0.85483871 0.82258065 0.80645161 0.82258065 0.80215054 0.81827957] mean value: 0.8136559139784947 key: train_roc_auc value: [0.93884892 0.89208633 0.98381295 0.97302158 0.94244604 0.96582734 0.99280576 0.98381295 0.98921508 0.97849462] mean value: 0.9640371573708775 key: test_jcc value: [0.61111111 0.64285714 0.67567568 0.77142857 0.75675676 0.71052632 0.69230769 0.68571429 0.69230769 0.66666667] mean value: 0.6905351910615068 key: train_jcc value: [0.87769784 0.82089552 0.9676259 0.94623656 0.89225589 0.93581081 0.98571429 0.9676259 0.97849462 0.95698925] mean value: 0.9329346581564345 MCC on Blind test: 0.04 Accuracy on Blind test: 0.43 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.03260899 0.03938699 0.03979325 0.03957844 0.03954554 0.04079795 0.03987575 0.03974915 0.0404129 0.0436821 ] mean value: 0.03954310417175293 key: score_time value: [0.0209713 0.01867533 0.01859689 0.01982999 0.01872468 0.01870179 0.01865554 0.01876974 0.01882648 0.02204275] mean value: 0.019379448890686036 key: test_mcc value: [0.87278605 0.90369611 0.77459667 0.84983659 0.93743687 0.93548387 0.93548387 0.93743687 0.83655914 0.8688172 ] mean value: 0.8852133236276603 key: train_mcc value: [0.95705746 0.95693359 0.96048758 0.96405373 0.94619622 0.95339163 0.93900081 0.95353974 0.95347639 0.96065614] mean value: 0.9544793279173203 key: test_accuracy value: [0.93548387 0.9516129 0.88709677 0.91935484 0.96774194 0.96774194 0.96774194 0.96774194 0.91803279 0.93442623] mean value: 0.9416975145425701 key: train_accuracy value: [0.97841727 0.97841727 0.98021583 0.98201439 0.97302158 0.97661871 0.96942446 0.97661871 0.97666068 0.98025135] mean value: 0.9771660230164163 key: test_fscore value: [0.9375 0.95238095 0.88888889 0.92537313 0.96875 0.96774194 0.96774194 0.96666667 0.91803279 0.93333333] mean value: 0.9426409633451187 key: train_fscore value: [0.97864769 0.97857143 0.980322 0.98207885 0.97326203 0.97682709 0.96969697 0.97690941 0.97682709 0.98046181] mean value: 0.9773604388336684 key: test_precision value: [0.90909091 0.9375 0.875 0.86111111 0.93939394 0.96774194 0.96774194 1. 0.93333333 0.93333333] mean value: 0.9324246497230368 key: train_precision value: [0.96830986 0.97163121 0.97508897 0.97857143 0.96466431 0.96819788 0.96113074 0.96491228 0.96819788 0.97183099] mean value: 0.9692535540709742 key: test_recall value: [0.96774194 0.96774194 0.90322581 1. 1. 0.96774194 0.96774194 0.93548387 0.90322581 0.93333333] mean value: 0.9546236559139785 key: train_recall value: [0.98920863 0.98561151 0.98561151 0.98561151 0.98201439 0.98561151 0.97841727 0.98920863 0.98561151 0.98924731] mean value: 0.9856153786648101 key: test_roc_auc value: [0.93548387 0.9516129 0.88709677 0.91935484 0.96774194 0.96774194 0.96774194 0.96774194 0.91827957 0.9344086 ] mean value: 0.9417204301075269 key: train_roc_auc value: [0.97841727 0.97841727 0.98021583 0.98201439 0.97302158 0.97661871 0.96942446 0.97661871 0.97667672 0.98023517] mean value: 0.9771660091281814 key: test_jcc value: [0.88235294 0.90909091 0.8 0.86111111 0.93939394 0.9375 0.9375 0.93548387 0.84848485 0.875 ] mean value: 0.8925917620225021 key: train_jcc value: [0.95818815 0.95804196 0.96140351 0.96478873 0.94791667 0.95470383 0.94117647 0.95486111 0.95470383 0.96167247] mean value: 0.9557456740257194 MCC on Blind test: 0.21 Accuracy on Blind test: 0.48 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.25235701 0.16581655 0.29582691 0.30781555 0.30480957 0.29897332 0.36340833 0.34521413 0.30596662 0.31356263] mean value: 0.29537506103515626 key: score_time value: [0.0121603 0.01885104 0.01891851 0.01890349 0.01882386 0.01906419 0.02405453 0.01878023 0.01878715 0.01880646] mean value: 0.01871497631072998 key: test_mcc value: [0.87278605 0.87096774 0.93743687 0.84266484 0.93743687 0.93548387 0.93548387 0.90748521 0.9344086 0.83638369] mean value: 0.9010537613218687 key: train_mcc value: [0.95705746 0.96768225 0.96768225 0.97124816 0.96058703 0.95339163 0.93900081 0.96778244 0.96784094 0.97855633] mean value: 0.9630829293630891 key: test_accuracy value: [0.93548387 0.93548387 0.96774194 0.91935484 0.96774194 0.96774194 0.96774194 0.9516129 0.96721311 0.91803279] mean value: 0.9498149127445796 key: train_accuracy value: [0.97841727 0.98381295 0.98381295 0.98561151 0.98021583 0.97661871 0.96942446 0.98381295 0.98384201 0.98922801] mean value: 0.9814796636658357 key: test_fscore value: [0.9375 0.93548387 0.96875 0.92307692 0.96875 0.96774194 0.96774194 0.94915254 0.96774194 0.91525424] mean value: 0.9501193380157295 key: train_fscore value: /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:135: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:138: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [0.97864769 0.98389982 0.98389982 0.98566308 0.98039216 0.97682709 0.96969697 0.98395722 0.98395722 0.98932384] mean value: 0.9816264914441175 key: test_precision value: [0.90909091 0.93548387 0.93939394 0.88235294 0.93939394 0.96774194 0.96774194 1. 0.96774194 0.93103448] mean value: 0.9439975889233234 key: train_precision value: [0.96830986 0.97864769 0.97864769 0.98214286 0.97173145 0.96819788 0.96113074 0.97526502 0.97526502 0.98233216] mean value: 0.9741670351447366 key: test_recall value: [0.96774194 0.93548387 1. 0.96774194 1. 0.96774194 0.96774194 0.90322581 0.96774194 0.9 ] mean value: 0.957741935483871 key: train_recall value: [0.98920863 0.98920863 0.98920863 0.98920863 0.98920863 0.98561151 0.97841727 0.99280576 0.99280576 0.99641577] mean value: 0.9892099223846729 key: test_roc_auc value: [0.93548387 0.93548387 0.96774194 0.91935484 0.96774194 0.96774194 0.96774194 0.9516129 0.9672043 0.91774194] mean value: 0.9497849462365592 key: train_roc_auc value: [0.97841727 0.98381295 0.98381295 0.98561151 0.98021583 0.97661871 0.96942446 0.98381295 0.98385807 0.98921508] mean value: 0.9814799773084758 key: test_jcc value: [0.88235294 0.87878788 0.93939394 0.85714286 0.93939394 0.9375 0.9375 0.90322581 0.9375 0.84375 ] mean value: 0.9056547362346699 key: train_jcc value: [0.95818815 0.96830986 0.96830986 0.97173145 0.96153846 0.95470383 0.94117647 0.96842105 0.96842105 0.97887324] mean value: 0.9639673429962302 MCC on Blind test: 0.19 Accuracy on Blind test: 0.47 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03648996 0.06613564 0.06211376 0.06729698 0.04555464 0.04708934 0.04693675 0.05631852 0.06484437 0.04323673] mean value: 0.05360167026519776 key: score_time value: [0.01524282 0.01525187 0.0154314 0.02462435 0.01772332 0.01929474 0.0188992 0.01929641 0.01779008 0.015306 ] mean value: 0.017886018753051756 key: test_mcc value: [0.93548387 0.64549722 0.83914639 0.79471941 0.83914639 0.93548387 0.67883359 0.80813523 0.67204301 0.80516731] mean value: 0.7953656310014937 key: train_mcc value: [0.89986294 0.86386843 0.84690871 0.85666952 0.83184526 0.84927258 0.86038603 0.86065376 0.85403593 0.85691637] mean value: 0.858041952871322 key: test_accuracy value: [0.96774194 0.82258065 0.91935484 0.88709677 0.91935484 0.96774194 0.83870968 0.90322581 0.83606557 0.90163934] mean value: 0.896351136964569 key: train_accuracy value: [0.94964029 0.93165468 0.92266187 0.92805755 0.91546763 0.92446043 0.92985612 0.92985612 0.92639138 0.92818671] mean value: 0.9286232773206928 key: test_fscore value: [0.96774194 0.82539683 0.92063492 0.89855072 0.92063492 0.96774194 0.84375 0.9 0.83870968 0.90322581] mean value: 0.8986386746143057 key: train_fscore value: [0.95053004 0.93286219 0.92495637 0.92932862 0.91739895 0.92553191 0.93121693 0.9314587 0.92819615 0.92957746] mean value: 0.9301057321039911 key: test_precision value: [0.96774194 0.8125 0.90625 0.81578947 0.90625 0.96774194 0.81818182 0.93103448 0.83870968 0.875 ] mean value: 0.8839199323011746 key: train_precision value: [0.93402778 0.91666667 0.89830508 0.91319444 0.89690722 0.91258741 0.91349481 0.91065292 0.90443686 0.91349481] mean value: 0.911376800312453 key: test_recall value: [0.96774194 0.83870968 0.93548387 1. 0.93548387 0.96774194 0.87096774 0.87096774 0.83870968 0.93333333] mean value: 0.9159139784946236 key: train_recall value: [0.9676259 0.94964029 0.95323741 0.94604317 0.93884892 0.93884892 0.94964029 0.95323741 0.95323741 0.94623656] mean value: 0.949659627137 key: test_roc_auc value: [0.96774194 0.82258065 0.91935484 0.88709677 0.91935484 0.96774194 0.83870968 0.90322581 0.83602151 0.90215054] mean value: 0.8963978494623657 key: train_roc_auc value: [0.94964029 0.93165468 0.92266187 0.92805755 0.91546763 0.92446043 0.92985612 0.92985612 0.92643949 0.92815425] mean value: 0.9286248420618344 key: test_jcc value: [0.9375 0.7027027 0.85294118 0.81578947 0.85294118 0.9375 0.72972973 0.81818182 0.72222222 0.82352941] mean value: 0.8193037711226565 key: train_jcc value: [0.90572391 0.87417219 0.86038961 0.8679868 0.8474026 0.86138614 0.87128713 0.87171053 0.86601307 0.86842105] mean value: 0.8694493015795971 MCC on Blind test: 0.23 Accuracy on Blind test: 0.53 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.92916417 0.92191863 1.12244606 0.9350934 1.06614399 0.9190464 1.05714083 0.94225955 1.10928631 0.9236002 ] mean value: 0.99260995388031 key: score_time value: [0.01470137 0.0223763 0.01537371 0.01549363 0.01523066 0.01540756 0.01223493 0.01533461 0.01543403 0.01531196] mean value: 0.015689873695373537 key: test_mcc value: [0.93548387 0.90369611 0.96824584 0.90369611 0.90748521 0.96824584 0.90369611 0.90748521 0.80322581 0.90586325] mean value: 0.9107123373005651 key: train_mcc value: [0.97844259 0.97844259 1. 1. 1. 0.97482645 0.99640932 0.97482645 0.98205307 1. ] mean value: 0.9885000467162429 key: test_accuracy value: [0.96774194 0.9516129 0.98387097 0.9516129 0.9516129 0.98387097 0.9516129 0.9516129 0.90163934 0.95081967] mean value: 0.9546007403490218 key: train_accuracy value: [0.98920863 0.98920863 1. 1. 1. 0.98741007 0.99820144 0.98741007 0.99102334 1. ] mean value: 0.9942462188238637 key: test_fscore value: [0.96774194 0.95238095 0.98360656 0.95238095 0.94915254 0.98360656 0.95081967 0.94915254 0.90322581 0.94736842] mean value: 0.9539435939381029 key: train_fscore value: [0.98924731 0.98924731 1. 1. 1. 0.98743268 0.9981982 0.98743268 0.99102334 1. ] mean value: 0.9942581511261652 key: test_precision value: [0.96774194 0.9375 1. 0.9375 1. 1. 0.96666667 1. 0.90322581 1. ] mean value: 0.9712634408602151 key: train_precision value: [0.98571429 0.98571429 1. 1. 1. 0.98566308 1. 0.98566308 0.98924731 1. ] mean value: 0.993200204813108 key: test_recall value: [0.96774194 0.96774194 0.96774194 0.96774194 0.90322581 0.96774194 0.93548387 0.90322581 0.90322581 0.9 ] mean value: 0.9383870967741935 key: train_recall value: [0.99280576 0.99280576 1. 1. 1. 0.98920863 0.99640288 0.98920863 0.99280576 1. ] mean value: 0.9953237410071942 key: test_roc_auc value: [0.96774194 0.9516129 0.98387097 0.9516129 0.9516129 0.98387097 0.9516129 0.9516129 0.9016129 0.95 ] mean value: 0.9545161290322581 key: train_roc_auc value: [0.98920863 0.98920863 1. 1. 1. 0.98741007 0.99820144 0.98741007 0.99102653 1. ] mean value: 0.9942465382532684 key: test_jcc value: [0.9375 0.90909091 0.96774194 0.90909091 0.90322581 0.96774194 0.90625 0.90322581 0.82352941 0.9 ] mean value: 0.9127396713817492 key: train_jcc value: [0.9787234 0.9787234 1. 1. 1. 0.9751773 0.99640288 0.9751773 0.98220641 1. ] mean value: 0.9886410701831508 MCC on Blind test: 0.12 Accuracy on Blind test: 0.39 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01557422 0.01259375 0.01137304 0.01015782 0.01030827 0.01022148 0.010221 0.01175594 0.01026297 0.01016307] mean value: 0.011263155937194824 key: score_time value: [0.01154852 0.01055121 0.00985241 0.00890565 0.00945139 0.00883079 0.00884581 0.00965166 0.0088501 0.00897288] mean value: 0.009546041488647461 key: test_mcc value: [0.48488114 0.22580645 0.71004695 0.62471615 0.71004695 0.84266484 0.42289003 0.5809475 0.50807349 0.60645161] mean value: 0.5716525113259076 key: train_mcc value: [0.53865672 0.62262853 0.58994332 0.60090995 0.59713776 0.61151079 0.61520743 0.58992806 0.60144143 0.60146217] mean value: 0.596882614684279 key: test_accuracy value: [0.74193548 0.61290323 0.85483871 0.80645161 0.85483871 0.91935484 0.70967742 0.79032258 0.75409836 0.80327869] mean value: 0.7847699629825489 key: train_accuracy value: [0.76438849 0.81115108 0.79496403 0.80035971 0.79856115 0.8057554 0.80755396 0.79496403 0.80071813 0.80071813] mean value: 0.7979134107435775 key: test_fscore value: [0.75 0.61290323 0.85714286 0.82352941 0.85245902 0.92307692 0.72727273 0.78688525 0.76190476 0.8 ] mean value: 0.7895174169263509 key: train_fscore value: [0.78489327 0.81415929 0.79569892 0.80284192 0.79928315 0.8057554 0.80580762 0.79496403 0.80071813 0.80213904] mean value: 0.8006260774087884 key: test_precision value: [0.72727273 0.61290323 0.84375 0.75675676 0.86666667 0.88235294 0.68571429 0.8 0.75 0.8 ] mean value: 0.7725416603393359 key: train_precision value: [0.72205438 0.80139373 0.79285714 0.79298246 0.79642857 0.8057554 0.81318681 0.79496403 0.79928315 0.79787234] mean value: 0.7916778011508354 key: test_recall value: [0.77419355 0.61290323 0.87096774 0.90322581 0.83870968 0.96774194 0.77419355 0.77419355 0.77419355 0.8 ] mean value: 0.8090322580645162 key: train_recall value: [0.85971223 0.82733813 0.79856115 0.81294964 0.80215827 0.8057554 0.79856115 0.79496403 0.80215827 0.80645161] mean value: 0.8108609886284521 key: test_roc_auc value: [0.74193548 0.61290323 0.85483871 0.80645161 0.85483871 0.91935484 0.70967742 0.79032258 0.75376344 0.80322581] mean value: 0.784731182795699 key: train_roc_auc value: [0.76438849 0.81115108 0.79496403 0.80035971 0.79856115 0.8057554 0.80755396 0.79496403 0.80072071 0.80070782] mean value: 0.79791263763183 key: test_jcc value: [0.6 0.44186047 0.75 0.7 0.74285714 0.85714286 0.57142857 0.64864865 0.61538462 0.66666667] mean value: 0.6593988967244782 key: train_jcc value: [0.64594595 0.68656716 0.66071429 0.67062315 0.66567164 0.6746988 0.67477204 0.65970149 0.66766467 0.66964286] mean value: 0.6676002035024714 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.0104816 0.01038766 0.01042938 0.01051688 0.01049376 0.01051188 0.01045275 0.01059914 0.01046395 0.01043129] mean value: 0.010476827621459961 key: score_time value: [0.00892305 0.00895 0.00893521 0.0091629 0.00889969 0.00888014 0.008919 0.00981951 0.00892782 0.00894523] mean value: 0.0090362548828125 key: test_mcc value: [0.64820372 0.55301004 0.5809475 0.49319696 0.74348441 0.67883359 0.58338335 0.58338335 0.57419355 0.73763441] mean value: 0.6176270892557514 key: train_mcc value: [0.6587893 0.67364319 0.6549475 0.6870548 0.6618705 0.68389584 0.67364319 0.68084793 0.67053524 0.65891743] mean value: 0.6704144922517364 key: test_accuracy value: [0.82258065 0.77419355 0.79032258 0.74193548 0.87096774 0.83870968 0.79032258 0.79032258 0.78688525 0.86885246] mean value: 0.807509254362771 key: train_accuracy value: [0.82913669 0.83633094 0.82733813 0.84352518 0.83093525 0.84172662 0.83633094 0.83992806 0.83482944 0.82944345] mean value: 0.8349524689045891 key: test_fscore value: [0.81355932 0.78787879 0.79365079 0.76470588 0.86666667 0.83333333 0.8 0.8 0.78688525 0.86666667] mean value: 0.8113346698484727 key: train_fscore value: [0.8324515 0.84063047 0.82978723 0.8438061 0.83093525 0.84452297 0.84063047 0.8441331 0.83859649 0.83065954] mean value: 0.8376153130590535 key: test_precision value: [0.85714286 0.74285714 0.78125 0.7027027 0.89655172 0.86206897 0.76470588 0.76470588 0.8 0.86666667] mean value: 0.8038651823730424 key: train_precision value: [0.816609 0.81911263 0.81818182 0.84229391 0.83093525 0.82986111 0.81911263 0.8225256 0.81849315 0.82624113] mean value: 0.8243366223120344 key: test_recall value: [0.77419355 0.83870968 0.80645161 0.83870968 0.83870968 0.80645161 0.83870968 0.83870968 0.77419355 0.86666667] mean value: 0.8221505376344086 key: train_recall value: [0.84892086 0.86330935 0.84172662 0.84532374 0.83093525 0.85971223 0.86330935 0.86690647 0.85971223 0.83512545] mean value: 0.8514981563136588 key: test_roc_auc value: [0.82258065 0.77419355 0.79032258 0.74193548 0.87096774 0.83870968 0.79032258 0.79032258 0.78709677 0.8688172 ] mean value: 0.8075268817204301 key: train_roc_auc value: [0.82913669 0.83633094 0.82733813 0.84352518 0.83093525 0.84172662 0.83633094 0.83992806 0.83487404 0.82943323] mean value: 0.8349559062427477 key: test_jcc value: [0.68571429 0.65 0.65789474 0.61904762 0.76470588 0.71428571 0.66666667 0.66666667 0.64864865 0.76470588] mean value: 0.6838336102577589 key: train_jcc value: [0.71299094 0.72507553 0.70909091 0.72981366 0.71076923 0.73088685 0.72507553 0.73030303 0.72205438 0.71036585] mean value: 0.7206425913193242 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01108789 0.0107801 0.01099753 0.01079965 0.0109024 0.00961852 0.00987792 0.01094413 0.01082826 0.01088762] mean value: 0.010672402381896973 key: score_time value: [0.0130856 0.01319814 0.01321411 0.01798558 0.01740742 0.01253605 0.01425529 0.01265335 0.01299381 0.01311779] mean value: 0.014044713973999024 key: test_mcc value: [0.43405737 0.42023032 0.51856298 0.45184806 0.61418277 0.58834841 0.39223227 0.52981294 0.44301075 0.44241145] mean value: 0.48346973185248315 key: train_mcc value: [0.70569372 0.71873948 0.65294473 0.69525741 0.67838657 0.69587174 0.71239616 0.72061896 0.71764405 0.70742859] mean value: 0.7004981411109261 key: test_accuracy value: [0.70967742 0.70967742 0.75806452 0.72580645 0.80645161 0.79032258 0.69354839 0.75806452 0.72131148 0.72131148] mean value: 0.7394235854045479 key: train_accuracy value: [0.85251799 0.85791367 0.82553957 0.8471223 0.8381295 0.8471223 0.85611511 0.85971223 0.85816876 0.85278276] mean value: 0.8495124187902819 key: test_fscore value: [0.66666667 0.71875 0.74576271 0.72131148 0.8 0.77192982 0.71641791 0.72727273 0.72131148 0.71186441] mean value: 0.7301287198412298 key: train_fscore value: [0.84926471 0.85122411 0.81869159 0.84288355 0.83146067 0.84171322 0.85454545 0.85555556 0.85343228 0.84758364] mean value: 0.8446354780098349 key: test_precision value: [0.7826087 0.6969697 0.78571429 0.73333333 0.82758621 0.84615385 0.66666667 0.83333333 0.73333333 0.72413793] mean value: 0.7629837329087704 key: train_precision value: [0.86842105 0.89328063 0.85214008 0.86692015 0.8671875 0.87258687 0.86397059 0.88167939 0.88122605 0.88030888] mean value: 0.8727721199038784 key: test_recall value: [0.58064516 0.74193548 0.70967742 0.70967742 0.77419355 0.70967742 0.77419355 0.64516129 0.70967742 0.7 ] mean value: 0.7054838709677419 key: train_recall value: [0.83093525 0.81294964 0.78776978 0.82014388 0.79856115 0.81294964 0.84532374 0.83093525 0.82733813 0.8172043 ] mean value: 0.8184110775895412 key: test_roc_auc value: [0.70967742 0.70967742 0.75806452 0.72580645 0.80645161 0.79032258 0.69354839 0.75806452 0.72150538 0.72096774] mean value: 0.7394086021505376 key: train_roc_auc value: [0.85251799 0.85791367 0.82553957 0.8471223 0.8381295 0.8471223 0.85611511 0.85971223 0.85811351 0.85284675] mean value: 0.849513292591733 key: test_jcc value: [0.5 0.56097561 0.59459459 0.56410256 0.66666667 0.62857143 0.55813953 0.57142857 0.56410256 0.55263158] mean value: 0.5761213113053576 key: train_jcc value: [0.73801917 0.74098361 0.69303797 0.7284345 0.71153846 0.7266881 0.74603175 0.74757282 0.74433657 0.73548387] mean value: 0.7312126821907436 MCC on Blind test: 0.17 Accuracy on Blind test: 0.55 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.03354144 0.02489185 0.02916217 0.02847791 0.02713418 0.02547169 0.02499604 0.025105 0.02454877 0.02502036] mean value: 0.026834940910339354 key: score_time value: [0.0131855 0.01280403 0.01284194 0.01289439 0.01388645 0.01286411 0.01271367 0.01277924 0.01253557 0.01256752] mean value: 0.01290724277496338 key: test_mcc value: [0.81325006 0.58834841 0.7190925 0.70116959 0.83914639 0.83914639 0.64820372 0.74348441 0.67314268 0.74460444] mean value: 0.7309588598194308 key: train_mcc value: [0.79601542 0.78762489 0.79209132 0.79002705 0.79151169 0.78818066 0.80264269 0.78357621 0.79775107 0.82936203] mean value: 0.7958783038857233 key: test_accuracy value: [0.90322581 0.79032258 0.85483871 0.83870968 0.91935484 0.91935484 0.82258065 0.87096774 0.83606557 0.86885246] mean value: 0.8624272871496562 key: train_accuracy value: [0.89568345 0.89208633 0.89388489 0.89388489 0.89388489 0.89208633 0.89928058 0.88848921 0.89587074 0.91382406] mean value: 0.8958975369076373 key: test_fscore value: [0.90909091 0.80597015 0.86567164 0.85714286 0.92063492 0.92063492 0.83076923 0.875 0.84375 0.875 ] mean value: 0.8703664629317615 key: train_fscore value: [0.90102389 0.89690722 0.8991453 0.89774697 0.89879931 0.89726027 0.90410959 0.89527027 0.90169492 0.91666667] mean value: 0.9008624402594712 key: test_precision value: [0.85714286 0.75 0.80555556 0.76923077 0.90625 0.90625 0.79411765 0.84848485 0.81818182 0.82352941] mean value: 0.8278742907419379 key: train_precision value: [0.85714286 0.85855263 0.85667752 0.86622074 0.85901639 0.85620915 0.8627451 0.84394904 0.8525641 0.88888889] mean value: 0.860196642678534 key: test_recall value: [0.96774194 0.87096774 0.93548387 0.96774194 0.93548387 0.93548387 0.87096774 0.90322581 0.87096774 0.93333333] mean value: 0.9191397849462366 key: train_recall value: [0.94964029 0.93884892 0.94604317 0.93165468 0.94244604 0.94244604 0.94964029 0.95323741 0.95683453 0.94623656] mean value: 0.945702792604626 key: test_roc_auc value: [0.90322581 0.79032258 0.85483871 0.83870968 0.91935484 0.91935484 0.82258065 0.87096774 0.83548387 0.86989247] mean value: 0.86247311827957 key: train_roc_auc value: [0.89568345 0.89208633 0.89388489 0.89388489 0.89388489 0.89208633 0.89928058 0.88848921 0.89597999 0.91376576] mean value: 0.8959026327325236 key: test_jcc value: [0.83333333 0.675 0.76315789 0.75 0.85294118 0.85294118 0.71052632 0.77777778 0.72972973 0.77777778] mean value: 0.7723185182086111 key: train_jcc value: [0.81987578 0.81308411 0.81677019 0.81446541 0.81619938 0.8136646 0.825 0.81039755 0.82098765 0.84615385] mean value: 0.8196598510899469 MCC on Blind test: 0.22 Accuracy on Blind test: 0.49 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.95948577 2.15034413 2.08079267 2.13337517 2.1066184 2.17374015 2.10629272 2.13330579 2.08265138 2.28587055] mean value: 2.121247673034668 key: score_time value: [0.01692438 0.01535654 0.01485014 0.02322173 0.02015829 0.01493502 0.02037382 0.01496053 0.02017426 0.02348351] mean value: 0.018443822860717773 key: test_mcc value: [0.93548387 0.87278605 0.93743687 0.87278605 0.96824584 0.90748521 0.77784447 0.93743687 0.80516731 0.81870035] mean value: 0.883337288515988 key: train_mcc value: [1. 1. 1. 0.99640932 0.99640932 1. 1. 1. 0.99284416 1. ] mean value: 0.9985662806359359 key: test_accuracy value: [0.96774194 0.93548387 0.96774194 0.93548387 0.98387097 0.9516129 0.88709677 0.96774194 0.90163934 0.90163934] mean value: 0.9400052882072978 key: train_accuracy value: [1. 1. 1. 0.99820144 0.99820144 1. 1. 1. 0.99640934 1. ] mean value: 0.9992812213424951 key: test_fscore value: [0.96774194 0.9375 0.96666667 0.9375 0.98360656 0.94915254 0.88135593 0.96666667 0.9 0.88888889] mean value: 0.9379079189659413 key: train_fscore value: [1. 1. 1. 0.99820467 0.9981982 1. 1. 1. 0.99638989 1. ] mean value: 0.9992792757758504 key: test_precision value: [0.96774194 0.90909091 1. 0.90909091 1. 1. 0.92857143 1. 0.93103448 1. ] mean value: 0.9645529664995738 key: train_precision value: [1. 1. 1. 0.99641577 1. 1. 1. 1. 1. 1. ] mean value: 0.9996415770609319 key: test_recall value: [0.96774194 0.96774194 0.93548387 0.96774194 0.96774194 0.90322581 0.83870968 0.93548387 0.87096774 0.8 ] mean value: 0.915483870967742 key: train_recall value: [1. 1. 1. 1. 0.99640288 1. 1. 1. 0.99280576 1. ] mean value: 0.9989208633093525 key: test_roc_auc value: [0.96774194 0.93548387 0.96774194 0.93548387 0.98387097 0.9516129 0.88709677 0.96774194 0.90215054 0.9 ] mean value: 0.9398924731182796 key: train_roc_auc value: [1. 1. 1. 0.99820144 0.99820144 1. 1. 1. 0.99640288 1. ] mean value: 0.9992805755395684 key: test_jcc value: [0.9375 0.88235294 0.93548387 0.88235294 0.96774194 0.90322581 0.78787879 0.93548387 0.81818182 0.8 ] mean value: 0.8850201972284515 key: train_jcc value: [1. 1. 1. 0.99641577 0.99640288 1. 1. 1. 0.99280576 1. ] mean value: 0.9985624403702844 MCC on Blind test: 0.14 Accuracy on Blind test: 0.35 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02974105 0.01884413 0.02266383 0.02178121 0.02214408 0.02337813 0.02314329 0.02074575 0.01937366 0.0216651 ] mean value: 0.0223480224609375 key: score_time value: [0.01200747 0.00932431 0.00934696 0.00890899 0.00897503 0.00935054 0.00896859 0.00912809 0.00909376 0.0090301 ] mean value: 0.009413385391235351 key: test_mcc value: [1. 0.87096774 1. 0.90369611 1. 0.93743687 0.93743687 0.96824584 0.9344086 0.93635873] mean value: 0.948855076082296 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.93548387 1. 0.9516129 1. 0.96774194 0.96774194 0.98387097 0.96721311 0.96721311] mean value: 0.9740877842411423 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.93548387 1. 0.95238095 1. 0.96666667 0.96666667 0.98360656 0.96774194 0.96551724] mean value: 0.9738063890922258 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.93548387 1. 0.9375 1. 1. 1. 1. 0.96774194 1. ] mean value: 0.9840725806451613 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.93548387 1. 0.96774194 1. 0.93548387 0.93548387 0.96774194 0.96774194 0.93333333] mean value: 0.9643010752688173 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.93548387 1. 0.9516129 1. 0.96774194 0.96774194 0.98387097 0.9672043 0.96666667] mean value: 0.9740322580645162 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.87878788 1. 0.90909091 1. 0.93548387 0.93548387 0.96774194 0.9375 0.93333333] mean value: 0.9497421798631476 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.01 Accuracy on Blind test: 0.2 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.12657785 0.12585592 0.12568116 0.1251936 0.12818575 0.12904644 0.12710953 0.12752652 0.12936378 0.12759137] mean value: 0.127213191986084 key: score_time value: [0.0178988 0.01849961 0.01826978 0.0182507 0.01817966 0.01839781 0.01924253 0.01940107 0.01964402 0.0183084 ] mean value: 0.018609237670898438 key: test_mcc value: [0.90369611 0.80813523 0.90748521 0.90369611 1. 0.90748521 0.83914639 0.87278605 0.83655914 0.96770777] mean value: 0.8946697236305513 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9516129 0.90322581 0.9516129 0.9516129 1. 0.9516129 0.91935484 0.93548387 0.91803279 0.98360656] mean value: 0.9466155473294553 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.95238095 0.90625 0.95384615 0.95238095 1. 0.94915254 0.92063492 0.9375 0.91803279 0.98305085] mean value: 0.9473229155958733 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9375 0.87878788 0.91176471 0.9375 1. 1. 0.90625 0.90909091 0.93333333 1. ] mean value: 0.9414226827094474 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 1. 0.96774194 1. 0.90322581 0.93548387 0.96774194 0.90322581 0.96666667] mean value: 0.9547311827956989 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9516129 0.90322581 0.9516129 0.9516129 1. 0.9516129 0.91935484 0.93548387 0.91827957 0.98333333] mean value: 0.9466129032258065 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.90909091 0.82857143 0.91176471 0.90909091 1. 0.90322581 0.85294118 0.88235294 0.84848485 0.96666667] mean value: 0.9012189391885786 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.16 Accuracy on Blind test: 0.36 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01068401 0.01052618 0.0105834 0.01052594 0.0105145 0.01098371 0.01058984 0.010818 0.01097584 0.01046133] mean value: 0.010666275024414062 key: score_time value: [0.00883293 0.00899005 0.00913048 0.00909328 0.00905633 0.00895715 0.00912786 0.008919 0.00896478 0.00893164] mean value: 0.009000349044799804 key: test_mcc value: [0.55895656 0.77784447 0.49319696 0.65372045 0.77784447 0.7190925 0.75623534 0.80645161 0.64895138 0.90586325] mean value: 0.7098156990531378 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.77419355 0.88709677 0.74193548 0.82258065 0.88709677 0.85483871 0.87096774 0.90322581 0.81967213 0.95081967] mean value: 0.8512427287149656 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.75 0.89230769 0.71428571 0.80701754 0.88135593 0.84210526 0.88235294 0.90322581 0.80701754 0.94736842] mean value: 0.8427036858354704 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.84 0.85294118 0.8 0.88461538 0.92857143 0.92307692 0.81081081 0.90322581 0.88461538 1. ] mean value: 0.8827856914612133 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.67741935 0.93548387 0.64516129 0.74193548 0.83870968 0.77419355 0.96774194 0.90322581 0.74193548 0.9 ] mean value: 0.8125806451612904 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.77419355 0.88709677 0.74193548 0.82258065 0.88709677 0.85483871 0.87096774 0.90322581 0.82096774 0.95 ] mean value: 0.8512903225806452 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.6 0.80555556 0.55555556 0.67647059 0.78787879 0.72727273 0.78947368 0.82352941 0.67647059 0.9 ] mean value: 0.7342206898708447 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.1 Accuracy on Blind test: 0.43 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.87444687 1.84364486 1.88241863 1.88341451 1.86338425 1.8370254 1.95046043 2.00012541 2.00316405 2.00365019] mean value: 1.914173460006714 key: score_time value: [0.09367371 0.09850073 0.09885931 0.09624052 0.09289575 0.09235954 0.10137892 0.10190368 0.10188222 0.10096431] mean value: 0.0978658676147461 key: test_mcc value: [1. 0.90369611 1. 0.90369611 1. 1. 0.90369611 0.96824584 0.93635873 0.93635873] mean value: 0.9552051644792718 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.9516129 1. 0.9516129 1. 1. 0.9516129 0.98387097 0.96721311 0.96721311] mean value: 0.9773135906927551 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.95238095 1. 0.95238095 1. 1. 0.95081967 0.98360656 0.96875 0.96551724] mean value: 0.9773455375649411 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9375 1. 0.9375 1. 1. 0.96666667 1. 0.93939394 1. ] mean value: 0.9781060606060606 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 1. 1. 0.93548387 0.96774194 1. 0.93333333] mean value: 0.9772043010752688 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.9516129 1. 0.9516129 1. 1. 0.9516129 0.98387097 0.96666667 0.96666667] mean value: 0.9772043010752689 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.90909091 1. 0.90909091 1. 1. 0.90625 0.96774194 0.93939394 0.93333333] mean value: 0.9564901026392962 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.23 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000...05', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.04206729 1.01224828 1.05134225 1.03499246 0.99280858 1.02267337 0.99256825 0.97221684 1.08185887 1.01412582] mean value: 1.0216902017593383 key: score_time value: [0.19844055 0.23611355 0.22823763 0.22109938 0.14304137 0.20978165 0.21308875 0.23787618 0.25533915 0.21346045] mean value: 0.21564786434173583 key: test_mcc value: [1. 0.83914639 0.96824584 0.87831007 1. 1. 0.90369611 0.96824584 0.90204573 0.8688172 ] mean value: 0.9328507181352339 key: train_mcc value: [0.98207157 0.98202074 0.97132357 0.98563702 0.97844259 0.97844259 0.97497785 0.98207157 0.97848145 0.98210326] mean value: 0.9795572214421221 key: test_accuracy value: [1. 0.91935484 0.98387097 0.93548387 1. 1. 0.9516129 0.98387097 0.95081967 0.93442623] mean value: 0.9659439450026441 key: train_accuracy value: [0.99100719 0.99100719 0.98561151 0.99280576 0.98920863 0.98920863 0.98741007 0.99100719 0.98922801 0.99102334] mean value: 0.9897517533549461 key: test_fscore value: [1. 0.92063492 0.98360656 0.93939394 1. 1. 0.95081967 0.98360656 0.95238095 0.93333333] mean value: 0.9663775932628391 key: train_fscore value: [0.99105546 0.99102334 0.98571429 0.99283154 0.98924731 0.98924731 0.98752228 0.99105546 0.98924731 0.99108734] mean value: 0.9898031639746488 key: test_precision value: [1. 0.90625 1. 0.88571429 1. 1. 0.96666667 1. 0.9375 0.93333333] mean value: 0.9629464285714285 key: train_precision value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( [0.98576512 0.98924731 0.9787234 0.98928571 0.98571429 0.98571429 0.97879859 0.98576512 0.98571429 0.9858156 ] mean value: 0.9850543726031485 key: test_recall value: [1. 0.93548387 0.96774194 1. 1. 1. 0.93548387 0.96774194 0.96774194 0.93333333] mean value: 0.970752688172043 key: train_recall value: [0.99640288 0.99280576 0.99280576 0.99640288 0.99280576 0.99280576 0.99640288 0.99640288 0.99280576 0.99641577] mean value: 0.9946056058379104 key: test_roc_auc value: [1. 0.91935484 0.98387097 0.93548387 1. 1. 0.9516129 0.98387097 0.95053763 0.9344086 ] mean value: 0.9659139784946237 key: train_roc_auc value: [0.99100719 0.99100719 0.98561151 0.99280576 0.98920863 0.98920863 0.98741007 0.99100719 0.98923442 0.99101364] mean value: 0.9897514246667182 key: test_jcc value: [1. 0.85294118 0.96774194 0.88571429 1. 1. 0.90625 0.96774194 0.90909091 0.875 ] mean value: 0.9364480242243525 key: train_jcc value: [0.9822695 0.98220641 0.97183099 0.98576512 0.9787234 0.9787234 0.97535211 0.9822695 0.9787234 0.98233216] mean value: 0.9798196004175848 MCC on Blind test: 0.12 Accuracy on Blind test: 0.26 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01132822 0.01071429 0.01171923 0.0117836 0.01190042 0.01191497 0.01188326 0.01205683 0.01197553 0.01189804] mean value: 0.011717438697814941 key: score_time value: [0.0150733 0.0090642 0.00976753 0.00985432 0.00976515 0.00981903 0.00985765 0.0098536 0.00985885 0.00983262] mean value: 0.010274624824523926 key: test_mcc value: [0.64820372 0.55301004 0.5809475 0.49319696 0.74348441 0.67883359 0.58338335 0.58338335 0.57419355 0.73763441] mean value: 0.6176270892557514 key: train_mcc value: [0.6587893 0.67364319 0.6549475 0.6870548 0.6618705 0.68389584 0.67364319 0.68084793 0.67053524 0.65891743] mean value: 0.6704144922517364 key: test_accuracy value: [0.82258065 0.77419355 0.79032258 0.74193548 0.87096774 0.83870968 0.79032258 0.79032258 0.78688525 0.86885246] mean value: 0.807509254362771 key: train_accuracy value: [0.82913669 0.83633094 0.82733813 0.84352518 0.83093525 0.84172662 0.83633094 0.83992806 0.83482944 0.82944345] mean value: 0.8349524689045891 key: test_fscore value: [0.81355932 0.78787879 0.79365079 0.76470588 0.86666667 0.83333333 0.8 0.8 0.78688525 0.86666667] mean value: 0.8113346698484727 key: train_fscore value: [0.8324515 0.84063047 0.82978723 0.8438061 0.83093525 0.84452297 0.84063047 0.8441331 0.83859649 0.83065954] mean value: 0.8376153130590535 key: test_precision value: [0.85714286 0.74285714 0.78125 0.7027027 0.89655172 0.86206897 0.76470588 0.76470588 0.8 0.86666667] mean value: 0.8038651823730424 key: train_precision value: [0.816609 0.81911263 0.81818182 0.84229391 0.83093525 0.82986111 0.81911263 0.8225256 0.81849315 0.82624113] mean value: 0.8243366223120344 key: test_recall value: [0.77419355 0.83870968 0.80645161 0.83870968 0.83870968 0.80645161 0.83870968 0.83870968 0.77419355 0.86666667] mean value: 0.8221505376344086 key: train_recall value: [0.84892086 0.86330935 0.84172662 0.84532374 0.83093525 0.85971223 0.86330935 0.86690647 0.85971223 0.83512545] mean value: 0.8514981563136588 key: test_roc_auc value: [0.82258065 0.77419355 0.79032258 0.74193548 0.87096774 0.83870968 0.79032258 0.79032258 0.78709677 0.8688172 ] mean value: 0.8075268817204301 key: train_roc_auc value: [0.82913669 0.83633094 0.82733813 0.84352518 0.83093525 0.84172662 0.83633094 0.83992806 0.83487404 0.82943323] mean value: 0.8349559062427477 key: test_jcc value: [0.68571429 0.65 0.65789474 0.61904762 0.76470588 0.71428571 0.66666667 0.66666667 0.64864865 0.76470588] mean value: 0.6838336102577589 key: train_jcc value: [0.71299094 0.72507553 0.70909091 0.72981366 0.71076923 0.73088685 0.72507553 0.73030303 0.72205438 0.71036585] mean value: 0.7206425913193242 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.09153771 0.06604338 0.07178664 0.0755806 0.09721923 0.07113528 0.07501578 0.08107591 0.09011579 0.07580256] mean value: 0.07953128814697266 key: score_time value: [0.01134396 0.01080441 0.01218295 0.01085663 0.01143789 0.01116538 0.01172423 0.01230764 0.01139951 0.01124096] mean value: 0.011446356773376465 key: test_mcc value: [1. 0.93548387 1. 0.96824584 0.96824584 0.93743687 1. 0.93743687 1. 0.93635873] mean value: 0.9683208010141471 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.96774194 1. 0.98387097 0.98387097 0.96774194 1. 0.96774194 1. 0.96721311] mean value: 0.9838180856689582 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.96774194 1. 0.98412698 0.98360656 0.96666667 1. 0.96666667 1. 0.96551724] mean value: 0.9834326051700548 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.96774194 1. 0.96875 1. 1. 1. 1. 1. 1. ] mean value: 0.9936491935483871 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 1. 0.96774194 0.93548387 1. 0.93548387 1. 0.93333333] mean value: 0.9739784946236559 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.96774194 1. 0.98387097 0.98387097 0.96774194 1. 0.96774194 1. 0.96666667] mean value: 0.983763440860215 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.9375 1. 0.96875 0.96774194 0.93548387 1. 0.93548387 1. 0.93333333] mean value: 0.9678293010752688 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04852033 0.05345178 0.04365349 0.07337523 0.0658989 0.07849956 0.07661462 0.05499148 0.07248783 0.0457654 ] mean value: 0.06132586002349853 key: score_time value: [0.01924372 0.01231289 0.01934409 0.01247478 0.02116656 0.01962399 0.01251864 0.01944685 0.01242089 0.01249909] mean value: 0.016105151176452635 key: test_mcc value: [0.83914639 0.87096774 1. 0.77784447 0.96824584 0.87278605 0.90369611 0.87278605 0.9344086 0.73763441] mean value: 0.8777515659651098 key: train_mcc value: [0.95705746 0.96412858 0.94283651 0.96058703 0.94634322 0.96048758 0.94266562 0.95353974 0.96419362 0.9679883 ] mean value: 0.9559827674087997 key: test_accuracy value: [0.91935484 0.93548387 1. 0.88709677 0.98387097 0.93548387 0.9516129 0.93548387 0.96721311 0.86885246] mean value: 0.9384452670544685 key: train_accuracy value: [0.97841727 0.98201439 0.97122302 0.98021583 0.97302158 0.98021583 0.97122302 0.97661871 0.98204668 0.98384201] mean value: 0.977883832969531 key: test_fscore value: [0.91803279 0.93548387 1. 0.89230769 0.98360656 0.93333333 0.95081967 0.93333333 0.96774194 0.86666667] mean value: 0.9381325848486082 key: train_fscore value: [0.97864769 0.98214286 0.97163121 0.98039216 0.97335702 0.980322 0.97153025 0.97690941 0.98214286 0.9840708 ] mean value: 0.9781146242643416 key: test_precision value: [0.93333333 0.93548387 1. 0.85294118 1. 0.96551724 0.96666667 0.96551724 0.96774194 0.86666667] mean value: 0.9453868132347488 key: train_precision value: [0.96830986 0.9751773 0.95804196 0.97173145 0.96140351 0.97508897 0.96126761 0.96491228 0.9751773 0.97202797] mean value: 0.9683138210996206 key: test_recall value: [0.90322581 0.93548387 1. 0.93548387 0.96774194 0.90322581 0.93548387 0.90322581 0.96774194 0.86666667] mean value: 0.9318279569892473 key: train_recall value: [0.98920863 0.98920863 0.98561151 0.98920863 0.98561151 0.98561151 0.98201439 0.98920863 0.98920863 0.99641577] mean value: 0.9881307856940253 key: test_roc_auc value: [0.91935484 0.93548387 1. 0.88709677 0.98387097 0.93548387 0.9516129 0.93548387 0.9672043 0.8688172 ] mean value: 0.9384408602150538 key: train_roc_auc value: [0.97841727 0.98201439 0.97122302 0.98021583 0.97302158 0.98021583 0.97122302 0.97661871 0.98205951 0.9838194 ] mean value: 0.9778828550063176 key: test_jcc value: [0.84848485 0.87878788 1. 0.80555556 0.96774194 0.875 0.90625 0.875 0.9375 0.76470588] mean value: 0.8859026100665095 key: train_jcc value: [0.95818815 0.96491228 0.94482759 0.96153846 0.94809689 0.96140351 0.94463668 0.95486111 0.96491228 0.96864111] mean value: 0.9572018061338432 MCC on Blind test: 0.13 Accuracy on Blind test: 0.37 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01502514 0.01342988 0.01100564 0.0116806 0.01108813 0.01071095 0.01139832 0.01138282 0.01146936 0.01143527] mean value: 0.011862611770629883 key: score_time value: [0.0120914 0.01019168 0.00957203 0.00933599 0.00964856 0.00924063 0.00965118 0.00963283 0.0095036 0.00960135] mean value: 0.009846925735473633 key: test_mcc value: [0.67741935 0.54953196 0.67741935 0.59603956 0.74193548 0.77784447 0.51639778 0.58834841 0.54086022 0.8403496 ] mean value: 0.6506146178904506 key: train_mcc value: [0.63097179 0.67116969 0.64951905 0.64509217 0.6522999 0.64916414 0.66359001 0.65576325 0.64576598 0.62037396] mean value: 0.6483709942789844 key: test_accuracy value: [0.83870968 0.77419355 0.83870968 0.79032258 0.87096774 0.88709677 0.75806452 0.79032258 0.7704918 0.91803279] mean value: 0.8236911686938128 key: train_accuracy value: [0.8147482 0.83453237 0.82374101 0.82194245 0.82553957 0.82374101 0.83093525 0.82733813 0.82226212 0.80969479] mean value: 0.8234474897640236 key: test_fscore value: [0.83870968 0.78125 0.83870968 0.8115942 0.87096774 0.89230769 0.76190476 0.80597015 0.77419355 0.92063492] mean value: 0.8296242372160947 key: train_fscore value: [0.82086957 0.84083045 0.83044983 0.82722513 0.83071553 0.82986111 0.83680556 0.83216783 0.82722513 0.81533101] mean value: 0.8291481145387779 key: test_precision value: [0.83870968 0.75757576 0.83870968 0.73684211 0.87096774 0.85294118 0.75 0.75 0.77419355 0.87878788] mean value: 0.8048727563258673 key: train_precision value: [0.79461279 0.81 0.8 0.80338983 0.80677966 0.80201342 0.80872483 0.80952381 0.80338983 0.79322034] mean value: 0.803165452018711 key: test_recall value: [0.83870968 0.80645161 0.83870968 0.90322581 0.87096774 0.93548387 0.77419355 0.87096774 0.77419355 0.96666667] mean value: 0.8579569892473118 key: train_recall value: [0.84892086 0.87410072 0.86330935 0.85251799 0.85611511 0.85971223 0.86690647 0.85611511 0.85251799 0.83870968] mean value: 0.8568925504757484 key: test_roc_auc value: [0.83870968 0.77419355 0.83870968 0.79032258 0.87096774 0.88709677 0.75806452 0.79032258 0.77043011 0.9188172 ] mean value: 0.8237634408602151 key: train_roc_auc value: [0.8147482 0.83453237 0.82374101 0.82194245 0.82553957 0.82374101 0.83093525 0.82733813 0.82231634 0.80964261] mean value: 0.8234476934581367 key: test_jcc value: [0.72222222 0.64102564 0.72222222 0.68292683 0.77142857 0.80555556 0.61538462 0.675 0.63157895 0.85294118] mean value: 0.712028578094613 key: train_jcc value: [0.69616519 0.72537313 0.71005917 0.70535714 0.71044776 0.70919881 0.71940299 0.71257485 0.70535714 0.68823529] mean value: 0.7082171487122775 MCC on Blind test: 0.19 Accuracy on Blind test: 0.55 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.0255146 0.02991891 0.02861214 0.0264709 0.0301187 0.02686048 0.03000832 0.03060079 0.03761053 0.03682876] mean value: 0.030254411697387695 key: score_time value: [0.0106895 0.01152873 0.01216507 0.01219368 0.01213622 0.01212597 0.01233673 0.0122211 0.01223564 0.01243973] mean value: 0.01200723648071289 key: test_mcc value: [0.90748521 0.87096774 0.93743687 0.90369611 0.93743687 0.87278605 0.90748521 0.82199494 0.87613871 0.90204573] mean value: 0.8937473447150879 key: train_mcc value: [0.94653932 0.94986154 0.95329292 0.93644001 0.92580909 0.88612956 0.93987712 0.8782527 0.95034654 0.97487172] mean value: 0.9341420499730175 key: test_accuracy value: [0.9516129 0.93548387 0.96774194 0.9516129 0.96774194 0.93548387 0.9516129 0.90322581 0.93442623 0.95081967] mean value: 0.9449762030671602 key: train_accuracy value: [0.97302158 0.97482014 0.97661871 0.9676259 0.96223022 0.94064748 0.96942446 0.93705036 0.97486535 0.98743268] mean value: 0.9663736874055513 key: test_fscore value: [0.94915254 0.93548387 0.96666667 0.95238095 0.96875 0.9375 0.95384615 0.89285714 0.93939394 0.94915254] mean value: 0.944518381085836 key: train_fscore value: [0.9725777 0.97454545 0.97649186 0.96678967 0.96322242 0.94358974 0.97012302 0.93383743 0.97526502 0.98743268] mean value: 0.9663874986610166 key: test_precision value: [1. 0.93548387 1. 0.9375 0.93939394 0.90909091 0.91176471 1. 0.88571429 0.96551724] mean value: 0.948446495242854 key: train_precision value: [0.98884758 0.98529412 0.98181818 0.99242424 0.93856655 0.8990228 0.94845361 0.98406375 0.95833333 0.98920863] mean value: 0.9666032799430763 key: test_recall value: [0.90322581 0.93548387 0.93548387 0.96774194 1. 0.96774194 1. 0.80645161 1. 0.93333333] mean value: 0.9449462365591398 key: train_recall value: [0.95683453 0.96402878 0.97122302 0.94244604 0.98920863 0.99280576 0.99280576 0.88848921 0.99280576 0.98566308] mean value: 0.9676310564451664 key: test_roc_auc value: [0.9516129 0.93548387 0.96774194 0.9516129 0.96774194 0.93548387 0.9516129 0.90322581 0.93333333 0.95053763] mean value: 0.9448387096774193 key: train_roc_auc value: [0.97302158 0.97482014 0.97661871 0.9676259 0.96223022 0.94064748 0.96942446 0.93705036 0.9748975 0.98743586] mean value: 0.9663772208040019 key: test_jcc value: [0.90322581 0.87878788 0.93548387 0.90909091 0.93939394 0.88235294 0.91176471 0.80645161 0.88571429 0.90322581] mean value: 0.895549175682003 key: train_jcc value: [0.94661922 0.95035461 0.9540636 0.93571429 0.92905405 0.89320388 0.94197952 0.87588652 0.95172414 0.9751773 ] mean value: 0.9353777144417266 MCC on Blind test: 0.17 Accuracy on Blind test: 0.49 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01973319 0.01927781 0.02288127 0.03501034 0.02319169 0.01984477 0.02170777 0.02149868 0.0256052 0.02394271] mean value: 0.023269343376159667 key: score_time value: [0.01226687 0.01212335 0.01218319 0.01252127 0.01216912 0.01212978 0.01235199 0.01232243 0.01220965 0.01219463] mean value: 0.012247228622436523 key: test_mcc value: [0.90748521 0.7130241 0.90748521 0.90748521 0.93743687 0.67419986 0.78446454 0.81325006 0.81870035 0.80983045] mean value: 0.827336186938833 key: train_mcc value: [0.96425338 0.88767604 0.89932729 0.96405373 0.83603758 0.77906124 0.91078521 0.78665297 0.8371636 0.90900083] mean value: 0.8774011859965327 key: test_accuracy value: [0.9516129 0.85483871 0.9516129 0.9516129 0.96774194 0.82258065 0.88709677 0.90322581 0.90163934 0.90163934] mean value: 0.9093601269169752 key: train_accuracy value: [0.98201439 0.94244604 0.94784173 0.98201439 0.91366906 0.87769784 0.95503597 0.88309353 0.91202873 0.95332136] mean value: 0.9349163039406895 key: test_fscore value: [0.95384615 0.84745763 0.95384615 0.95384615 0.96666667 0.84507042 0.87719298 0.90909091 0.91176471 0.89285714] mean value: 0.9111638918145528 key: train_fscore value: [0.98220641 0.94007491 0.95008606 0.98194946 0.90697674 0.89102564 0.95412844 0.89499192 0.91900826 0.95167286] mean value: 0.9372120704015114 key: test_precision value: [0.91176471 0.89285714 0.91176471 0.91176471 1. 0.75 0.96153846 0.85714286 0.83783784 0.96153846] mean value: 0.899620887856182 key: train_precision value: [0.97183099 0.98046875 0.91089109 0.98550725 0.98319328 0.80346821 0.97378277 0.81231672 0.85015291 0.98841699] mean value: 0.9260028937498493 key: test_recall value: [1. 0.80645161 1. 1. 0.93548387 0.96774194 0.80645161 0.96774194 1. 0.83333333] mean value: 0.9317204301075269 key: train_recall value: [0.99280576 0.9028777 0.99280576 0.97841727 0.84172662 1. 0.9352518 0.99640288 1. 0.91756272] mean value: 0.955785049379851 key: test_roc_auc value: [0.9516129 0.85483871 0.9516129 0.9516129 0.96774194 0.82258065 0.88709677 0.90322581 0.9 0.90053763] mean value: 0.9090860215053763 key: train_roc_auc value: [0.98201439 0.94244604 0.94784173 0.98201439 0.91366906 0.87769784 0.95503597 0.88309353 0.91218638 0.95338568] mean value: 0.9349385008122534 key: test_jcc value: [0.91176471 0.73529412 0.91176471 0.91176471 0.93548387 0.73170732 0.78125 0.83333333 0.83783784 0.80645161] mean value: 0.8396652207409427 key: train_jcc value: [0.96503497 0.8869258 0.90491803 0.96453901 0.82978723 0.80346821 0.9122807 0.80994152 0.85015291 0.90780142] mean value: 0.8834849787962806 MCC on Blind test: 0.21 Accuracy on Blind test: 0.57 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.21634364 0.19705439 0.20209169 0.20247054 0.18944693 0.18858314 0.19010186 0.1894753 0.19017911 0.19933033] mean value: 0.19650769233703613 key: score_time value: [0.01674056 0.01665401 0.01689458 0.01660991 0.01544285 0.01575398 0.01562691 0.01547384 0.01630163 0.01683712] mean value: 0.016233539581298827 key: test_mcc value: [1. 0.93743687 0.96824584 0.90369611 1. 0.96824584 0.87278605 0.93743687 0.96770777 0.96770777] mean value: 0.9523263113907507 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.96774194 0.98387097 0.9516129 1. 0.98387097 0.93548387 0.96774194 0.98360656 0.98360656] mean value: 0.975753569539926 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.96875 0.98360656 0.95238095 1. 0.98360656 0.93333333 0.96666667 0.98412698 0.98305085] mean value: 0.9755521898719661 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.93939394 1. 0.9375 1. 1. 0.96551724 1. 0.96875 1. ] mean value: 0.981116118077325 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.96774194 0.96774194 1. 0.96774194 0.90322581 0.93548387 1. 0.96666667] mean value: 0.9708602150537634 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.96774194 0.98387097 0.9516129 1. 0.98387097 0.93548387 0.96774194 0.98333333 0.98333333] mean value: 0.9756989247311828 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.93939394 0.96774194 0.90909091 1. 0.96774194 0.875 0.93548387 0.96875 0.96666667] mean value: 0.9529869257086999 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.09 Accuracy on Blind test: 0.2 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.07209158 0.07015896 0.07557607 0.07624006 0.07895207 0.08325553 0.08903265 0.07233906 0.08569217 0.08808112] mean value: 0.07914192676544189 key: score_time value: [0.02525473 0.02876973 0.03141665 0.02610064 0.03126097 0.03248501 0.04130626 0.03619337 0.0269568 0.04313469] mean value: 0.032287883758544925 key: test_mcc value: [1. 0.90369611 1. 0.93548387 0.96824584 0.90748521 0.96824584 0.96824584 1. 0.87613871] mean value: 0.9527541422538592 key: train_mcc value: [0.99283145 0.99283145 0.99640932 0.99640932 0.98207157 0.98921503 0.99640932 0.99640932 0.99284416 0.99284434] mean value: 0.9928275300190849 key: test_accuracy value: [1. 0.9516129 1. 0.96774194 0.98387097 0.9516129 0.98387097 0.98387097 1. 0.93442623] mean value: 0.9757006874669487 key: train_accuracy value: [0.99640288 0.99640288 0.99820144 0.99820144 0.99100719 0.99460432 0.99820144 0.99820144 0.99640934 0.99640934] mean value: 0.9964041693036954 key: test_fscore value: [1. 0.95081967 1. 0.96774194 0.98360656 0.94915254 0.98412698 0.98360656 1. 0.92857143] mean value: 0.9747625677440411 key: train_fscore value: [0.99638989 0.99638989 0.9981982 0.9981982 0.99095841 0.99459459 0.99820467 0.9981982 0.99638989 0.99640288] mean value: 0.9963924818520766 key: test_precision value: [1. 0.96666667 1. 0.96774194 1. 1. 0.96875 1. 1. 1. ] mean value: 0.9903158602150538 key: train_precision value: [1. 1. 1. 1. 0.99636364 0.99638989 0.99641577 1. 1. 1. ] mean value: 0.9989169298669707 key: test_recall value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] [1. 0.93548387 1. 0.96774194 0.96774194 0.90322581 1. 0.96774194 1. 0.86666667] mean value: 0.9608602150537634 key: train_recall value: [0.99280576 0.99280576 0.99640288 0.99640288 0.98561151 0.99280576 1. 0.99640288 0.99280576 0.99283154] mean value: 0.9938874706686264 key: test_roc_auc value: [1. 0.9516129 1. 0.96774194 0.98387097 0.9516129 0.98387097 0.98387097 1. 0.93333333] mean value: 0.9755913978494624 key: train_roc_auc value: [0.99640288 0.99640288 0.99820144 0.99820144 0.99100719 0.99460432 0.99820144 0.99820144 0.99640288 0.99641577] mean value: 0.9964041669889895 key: test_jcc value: [1. 0.90625 1. 0.9375 0.96774194 0.90322581 0.96875 0.96774194 1. 0.86666667] mean value: 0.9517876344086021 key: train_jcc value: [0.99280576 0.99280576 0.99640288 0.99640288 0.98207885 0.98924731 0.99641577 0.99640288 0.99280576 0.99283154] mean value: 0.9928199375983084 MCC on Blind test: 0.02 Accuracy on Blind test: 0.21 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.22249794 0.21834207 0.20632315 0.21552396 0.18451118 0.13607502 0.16399002 0.21262717 0.26152039 0.24413514] mean value: 0.20655460357666017 key: score_time value: [0.02721357 0.02944684 0.02703238 0.02707863 0.01634979 0.01623416 0.01634526 0.03195477 0.03438377 0.0317266 ] mean value: 0.02577657699584961 key: test_mcc value: [0.80813523 0.64549722 0.87096774 0.87096774 0.78446454 0.78446454 0.71004695 0.74348441 0.67858574 0.84710837] mean value: 0.7743722483793141 key: train_mcc value: [0.96402878 0.95683453 0.96763216 0.97124816 0.96405373 0.97124816 0.97124816 0.97841727 0.96769036 0.97127459] mean value: 0.9683675887070088 key: test_accuracy value: [0.90322581 0.82258065 0.93548387 0.93548387 0.88709677 0.88709677 0.85483871 0.87096774 0.83606557 0.91803279] mean value: 0.8850872554204124 key: train_accuracy value: [0.98201439 0.97841727 0.98381295 0.98561151 0.98201439 0.98561151 0.98561151 0.98920863 0.98384201 0.98563734] mean value: 0.9841781511953812 key: test_fscore value: [0.9 0.82539683 0.93548387 0.93548387 0.87719298 0.87719298 0.85714286 0.86666667 0.82758621 0.90909091] mean value: 0.8811237172041574 key: train_fscore value: [0.98201439 0.97841727 0.98384201 0.98555957 0.98194946 0.98555957 0.98555957 0.98920863 0.98384201 0.98566308] mean value: 0.9841615550595811 key: test_precision value: [0.93103448 0.8125 0.93548387 0.93548387 0.96153846 0.96153846 0.84375 0.89655172 0.88888889 1. ] mean value: 0.9166769760797847 key: train_precision value: [0.98201439 0.97841727 0.98207885 0.98913043 0.98550725 0.98913043 0.98913043 0.98920863 0.98207885 0.98566308] mean value: 0.9852359627024888 key: test_recall value: [0.87096774 0.83870968 0.93548387 0.93548387 0.80645161 0.80645161 0.87096774 0.83870968 0.77419355 0.83333333] mean value: 0.8510752688172043 key: train_recall value: [0.98201439 0.97841727 0.98561151 0.98201439 0.97841727 0.98201439 0.98201439 0.98920863 0.98561151 0.98566308] mean value: 0.983098682344447 key: test_roc_auc value: [0.90322581 0.82258065 0.93548387 0.93548387 0.88709677 0.88709677 0.85483871 0.87096774 0.83709677 0.91666667] mean value: 0.8850537634408603 key: train_roc_auc value: [0.98201439 0.97841727 0.98381295 0.98561151 0.98201439 0.98561151 0.98561151 0.98920863 0.98384518 0.9856373 ] mean value: 0.9841784636806684 key: test_jcc value: [0.81818182 0.7027027 0.87878788 0.87878788 0.78125 0.78125 0.75 0.76470588 0.70588235 0.83333333] mean value: 0.7894881847087729 key: train_jcc value: [0.96466431 0.95774648 0.96819788 0.97153025 0.96453901 0.97153025 0.97153025 0.97864769 0.96819788 0.97173145] mean value: 0.9688315439563768 MCC on Blind test: 0.19 Accuracy on Blind test: 0.49 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.76334548 0.75004768 0.75521564 0.75695348 0.74053025 0.74805856 0.72955751 0.7617662 0.77704668 0.75503945] mean value: 0.7537560939788819 key: score_time value: [0.0095737 0.00933933 0.00931716 0.00973773 0.00978923 0.00931406 0.00945735 0.01039243 0.01002336 0.00945759] mean value: 0.009640192985534668 key: test_mcc value: [1. 0.90369611 1. 0.90369611 1. 0.93743687 1. 0.96824584 0.96770777 0.90586325] mean value: 0.9586645958228902 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [1. 0.9516129 1. 0.9516129 1. 0.96774194 1. 0.98387097 0.98360656 0.95081967] mean value: 0.9789264939185616 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [1. 0.95238095 1. 0.95238095 1. 0.96666667 1. 0.98360656 0.98412698 0.94736842] mean value: 0.9786530533985236 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.9375 1. 0.9375 1. 1. 1. 1. 0.96875 1. ] mean value: 0.984375 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 0.96774194 1. 0.96774194 1. 0.93548387 1. 0.96774194 1. 0.9 ] mean value: 0.9738709677419355 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [1. 0.9516129 1. 0.9516129 1. 0.96774194 1. 0.98387097 0.98333333 0.95 ] mean value: 0.9788172043010753 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [1. 0.90909091 1. 0.90909091 1. 0.93548387 1. 0.96774194 0.96875 0.9 ] mean value: 0.9590157624633431 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03594589 0.03572512 0.03616953 0.03602529 0.03524661 0.03499389 0.03544331 0.03508949 0.03510737 0.0351696 ] mean value: 0.03549160957336426 key: score_time value: [0.01282716 0.0128293 0.01287484 0.01278615 0.01273584 0.01271701 0.01275468 0.01284575 0.01267576 0.01275253] mean value: 0.012779903411865235 key: test_mcc value: [0.55205245 0.49319696 0.61807005 0.57935845 0.74161985 0.61807005 0.54953196 0.54953196 0.74352218 0.47526882] mean value: 0.5920222715118862 key: train_mcc value: [0.7987718 0.83577199 0.79337932 0.7147514 0.73496 0.84543222 0.86422693 0.83249324 0.86463537 0.80414275] mean value: 0.8088565031567613 key: test_accuracy value: [0.75806452 0.74193548 0.80645161 0.77419355 0.85483871 0.80645161 0.77419355 0.77419355 0.86885246 0.73770492] mean value: 0.7896879957694342 key: train_accuracy value: [0.89028777 0.9118705 0.89388489 0.8381295 0.85071942 0.92086331 0.92805755 0.91546763 0.93177738 0.89766607] mean value: 0.897872402257727 key: test_fscore value: [0.70588235 0.71428571 0.81818182 0.73076923 0.83018868 0.79310345 0.76666667 0.76666667 0.87878788 0.73333333] mean value: 0.7737865789153631 key: train_fscore value: [0.87726358 0.90373281 0.89983022 0.80686695 0.82452431 0.91698113 0.92277992 0.91280148 0.9298893 0.88974855] mean value: 0.8884418264619824 key: test_precision value: [0.9 0.8 0.77142857 0.9047619 1. 0.85185185 0.79310345 0.79310345 0.82857143 0.73333333] mean value: 0.8376153986498814 key: train_precision value: [0.99543379 0.995671 0.85209003 1. 1. 0.96428571 0.99583333 0.94252874 0.95454545 0.96638655] mean value: 0.9666774610198209 key: test_recall value: [0.58064516 0.64516129 0.87096774 0.61290323 0.70967742 0.74193548 0.74193548 0.74193548 0.93548387 0.73333333] mean value: 0.7313978494623656 key: train_recall value: [0.78417266 0.82733813 0.95323741 0.67625899 0.70143885 0.87410072 0.85971223 0.88489209 0.90647482 0.82437276] mean value: 0.8291998659137206 key: test_roc_auc value: [0.75806452 0.74193548 0.80645161 0.77419355 0.85483871 0.80645161 0.77419355 0.77419355 0.86774194 0.73763441] mean value: 0.7895698924731183 key: train_roc_auc value: [0.89028777 0.9118705 0.89388489 0.8381295 0.85071942 0.92086331 0.92805755 0.91546763 0.93173203 0.89779789] mean value: 0.8978810499987108 key: test_jcc value: [0.54545455 0.55555556 0.69230769 0.57575758 0.70967742 0.65714286 0.62162162 0.62162162 0.78378378 0.57894737] mean value: 0.6341870041021145 key: train_jcc value: [0.78136201 0.82437276 0.81790123 0.67625899 0.70143885 0.8466899 0.85663082 0.83959044 0.86896552 0.80139373] mean value: 0.8014604252313136 MCC on Blind test: 0.03 Accuracy on Blind test: 0.46 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.01635957 0.01777172 0.03165007 0.04656363 0.04724741 0.0389545 0.03878331 0.03752255 0.03754044 0.04346228] mean value: 0.035585546493530275 key: score_time value: [0.01368904 0.0122335 0.01877952 0.02584672 0.03397083 0.0331285 0.0305109 0.02508545 0.02114463 0.02042747] mean value: 0.02348165512084961 key: test_mcc value: [0.90369611 0.87278605 0.90369611 0.84983659 0.93743687 0.93548387 0.90369611 0.87278605 0.9344086 0.83655914] mean value: 0.8950385503763444 key: train_mcc value: [0.94634322 0.9393413 0.93958474 0.95693359 0.93238486 0.95339163 0.93585746 0.95025527 0.94663736 0.94994909] mean value: 0.945067852011007 key: test_accuracy value: [0.9516129 0.93548387 0.9516129 0.91935484 0.96774194 0.96774194 0.9516129 0.93548387 0.96721311 0.91803279] mean value: 0.9465891062929667 key: train_accuracy value: [0.97302158 0.96942446 0.96942446 0.97841727 0.96582734 0.97661871 0.9676259 0.97482014 0.97307002 0.97486535] mean value: 0.9723115224158196 key: test_fscore value: [0.95238095 0.9375 0.95238095 0.92537313 0.96875 0.96774194 0.95081967 0.93333333 0.96774194 0.91803279] mean value: 0.9474054702407732 key: train_fscore /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:155: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:158: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) value: [0.97335702 0.9699115 0.97001764 0.97857143 0.9664903 0.97682709 0.96819788 0.97526502 0.97345133 0.9751773 ] mean value: 0.9727266509888757 key: test_precision value: [0.9375 0.90909091 0.9375 0.86111111 0.93939394 0.96774194 0.96666667 0.96551724 0.96774194 0.90322581] mean value: 0.9355489545061292 key: train_precision value: [0.96140351 0.95470383 0.95155709 0.97163121 0.94809689 0.96819788 0.95138889 0.95833333 0.95818815 0.96491228] mean value: 0.9588413062529795 key: test_recall value: [0.96774194 0.96774194 0.96774194 1. 1. 0.96774194 0.93548387 0.90322581 0.96774194 0.93333333] mean value: 0.9610752688172043 key: train_recall value: [0.98561151 0.98561151 0.98920863 0.98561151 0.98561151 0.98561151 0.98561151 0.99280576 0.98920863 0.98566308] mean value: 0.9870555168768211 key: test_roc_auc value: [0.9516129 0.93548387 0.9516129 0.91935484 0.96774194 0.96774194 0.9516129 0.93548387 0.9672043 0.91827957] mean value: 0.9466129032258065 key: train_roc_auc value: [0.97302158 0.96942446 0.96942446 0.97841727 0.96582734 0.97661871 0.9676259 0.97482014 0.97309894 0.97484593] mean value: 0.9723124726025631 key: test_jcc value: [0.90909091 0.88235294 0.90909091 0.86111111 0.93939394 0.9375 0.90625 0.875 0.9375 0.84848485] mean value: 0.9005774658348188 key: train_jcc value: [0.94809689 0.94158076 0.94178082 0.95804196 0.93515358 0.95470383 0.93835616 0.95172414 0.94827586 0.95155709] mean value: 0.9469271095966189 MCC on Blind test: 0.14 Accuracy on Blind test: 0.45 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.21475935 0.18135929 0.22929859 0.29797983 0.30646324 0.30647278 0.35200739 0.39313698 0.30069876 0.32392144] mean value: 0.2906097650527954 key: score_time value: [0.01228642 0.01898122 0.01878119 0.01901627 0.0246706 0.01887894 0.01930165 0.01910377 0.02126479 0.01894569] mean value: 0.019123053550720213 key: test_mcc value: [0.83914639 0.87278605 0.96824584 0.84983659 0.96824584 0.90369611 0.90369611 0.84266484 0.9344086 0.83655914] mean value: 0.8919285509341773 key: train_mcc value: [0.95003374 0.96412858 0.94283651 0.95693359 0.94283651 0.96048758 0.93585746 0.95723096 0.96419362 0.94994909] mean value: 0.9524487657494496 key: test_accuracy value: [0.91935484 0.93548387 0.98387097 0.91935484 0.98387097 0.9516129 0.9516129 0.91935484 0.96721311 0.91803279] mean value: 0.9449762030671602 key: train_accuracy value: [0.97482014 0.98201439 0.97122302 0.97841727 0.97122302 0.98021583 0.9676259 0.97841727 0.98204668 0.97486535] mean value: 0.9760868863257688 key: test_fscore value: [0.92063492 0.9375 0.98412698 0.92537313 0.98360656 0.95081967 0.95081967 0.91525424 0.96774194 0.91803279] mean value: 0.945390990038686 key: train_fscore value: [0.9751773 0.98214286 0.97163121 0.97857143 0.97163121 0.980322 0.96819788 0.9787234 0.98214286 0.9751773 ] mean value: 0.9763717451825532 key: test_precision value: [0.90625 0.90909091 0.96875 0.86111111 1. 0.96666667 0.96666667 0.96428571 0.96774194 0.90322581] mean value: 0.9413788809756551 key: train_precision value: [0.96153846 0.9751773 0.95804196 0.97163121 0.95804196 0.97508897 0.95138889 0.96503497 0.9751773 0.96491228] mean value: 0.9656033295822353 key: test_recall value: [0.93548387 0.96774194 1. 1. 0.96774194 0.93548387 0.93548387 0.87096774 0.96774194 0.93333333] mean value: 0.9513978494623656 key: train_recall value: [0.98920863 0.98920863 0.98561151 0.98561151 0.98561151 0.98561151 0.98561151 0.99280576 0.98920863 0.98566308] mean value: 0.9874152291070369 key: test_roc_auc value: [0.91935484 0.93548387 0.98387097 0.91935484 0.98387097 0.9516129 0.9516129 0.91935484 0.9672043 0.91827957] mean value: 0.9450000000000001 key: train_roc_auc value: [0.97482014 0.98201439 0.97122302 0.97841727 0.97122302 0.98021583 0.9676259 0.97841727 0.98205951 0.97484593] mean value: 0.97608622779196 key: test_jcc value: [0.85294118 0.88235294 0.96875 0.86111111 0.96774194 0.90625 0.90625 0.84375 0.9375 0.84848485] mean value: 0.897513201272689 key: train_jcc value: [0.95155709 0.96491228 0.94482759 0.95804196 0.94482759 0.96140351 0.93835616 0.95833333 0.96491228 0.95155709] mean value: 0.9538728885199296 MCC on Blind test: 0.14 Accuracy on Blind test: 0.41 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03412819 0.0332408 0.03413343 0.03263998 0.03388739 0.03435946 0.02902317 0.03443003 0.03482413 0.03413296] mean value: 0.033479952812194826 key: score_time value: [0.01193094 0.01195025 0.01777911 0.01195312 0.01445174 0.01460767 0.01191902 0.01462698 0.01480126 0.01203799] mean value: 0.013605809211730957 key: test_mcc value: [0.75592895 0.75592895 0.81409158 0.82717019 0.81409158 0.81409158 0.80833333 0.9372467 0.68826048 0.6778302 ] mean value: 0.7892973529226506 key: train_mcc value: [0.86725157 0.8612933 0.84641474 0.89492115 0.86052165 0.86794223 0.84766497 0.84766497 0.86842762 0.87508713] mean value: 0.8637189322784831 key: test_accuracy value: [0.875 0.875 0.90625 0.90625 0.90625 0.90625 0.90322581 0.96774194 0.83870968 0.83870968] mean value: 0.8923387096774194 key: train_accuracy value: [0.93309859 0.92957746 0.92253521 0.9471831 0.92957746 0.93309859 0.92280702 0.92280702 0.93333333 0.93684211] mean value: 0.9310859896219421 key: test_fscore value: [0.88235294 0.88235294 0.90909091 0.91428571 0.90909091 0.90909091 0.90322581 0.96551724 0.85714286 0.84848485] mean value: 0.8980635077370012 key: train_fscore value: [0.9347079 0.93197279 0.92465753 0.94809689 0.93150685 0.93515358 0.92567568 0.92567568 0.93515358 0.93835616] mean value: 0.9330956645240915 key: test_precision value: [0.83333333 0.83333333 0.88235294 0.84210526 0.88235294 0.88235294 0.875 1. 0.78947368 0.82352941] mean value: 0.8643833849329206 key: train_precision value: [0.91275168 0.90131579 0.9 0.93197279 0.90666667 0.90728477 0.89542484 0.89542484 0.90728477 0.91333333] mean value: 0.9071459466068135 key: test_recall value: [0.9375 0.9375 0.9375 1. 0.9375 0.9375 0.93333333 0.93333333 0.9375 0.875 ] mean value: 0.9366666666666666 key: train_recall value: [0.95774648 0.96478873 0.95070423 0.96478873 0.95774648 0.96478873 0.95804196 0.95804196 0.96478873 0.96478873] mean value: 0.9606224761154338 key: test_roc_auc value: [0.875 0.875 0.90625 0.90625 0.90625 0.90625 0.90416667 0.96666667 0.83541667 0.8375 ] mean value: 0.891875 key: train_roc_auc value: [0.93309859 0.92957746 0.92253521 0.9471831 0.92957746 0.93309859 0.92268295 0.92268295 0.93344332 0.93693982] mean value: 0.9310819462227913 key: test_jcc value: [0.78947368 0.78947368 0.83333333 0.84210526 0.83333333 0.83333333 0.82352941 0.93333333 0.75 0.73684211] mean value: 0.8164757481940145 key: train_jcc value: [0.87741935 0.87261146 0.85987261 0.90131579 0.87179487 0.87820513 0.86163522 0.86163522 0.87820513 0.88387097] mean value: 0.8746565756944151 MCC on Blind test: 0.2 Accuracy on Blind test: 0.54 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.91829085 0.83430195 0.91920638 0.83813977 0.83696795 0.96647 0.8144691 0.92707682 0.77499366 0.80930591] mean value: 0.8639222383499146 key: score_time value: [0.01481414 0.01517582 0.01258183 0.0154736 0.01538038 0.01537609 0.01545954 0.01542616 0.01332378 0.01214576] mean value: 0.01451570987701416 key: test_mcc value: [0.75592895 0.75592895 0.93933644 0.8819171 0.93933644 0.81409158 0.74166667 0.82078268 0.74689528 0.74896053] mean value: 0.8144844609786615 key: train_mcc value: [0.99298237 0.98591549 0.98591549 1. 0.98591549 0.98591549 1. 0.98596474 0.90211827 1. ] mean value: 0.9824727346759803 key: test_accuracy value: [0.875 0.875 0.96875 0.9375 0.96875 0.90625 0.87096774 0.90322581 0.87096774 0.87096774] mean value: 0.9047379032258065 key: train_accuracy value: [0.99647887 0.99295775 0.99295775 1. 0.99295775 0.99295775 1. 0.99298246 0.95087719 1. ] mean value: 0.9912169508277736 key: test_fscore value: [0.88235294 0.88235294 0.96969697 0.94117647 0.96969697 0.90322581 0.86666667 0.88888889 0.88235294 0.86666667] mean value: 0.9053077262185422 key: train_fscore value: [0.99649123 0.99295775 0.99295775 1. 0.99295775 0.99295775 1. 0.99300699 0.95138889 1. ] mean value: 0.9912718095881551 key: test_precision value: [0.83333333 0.83333333 0.94117647 0.88888889 0.94117647 0.93333333 0.86666667 1. 0.83333333 0.92857143] mean value: 0.8999813258636788 key: train_precision value: [0.99300699 0.99295775 0.99295775 1. 0.99295775 0.99295775 1. 0.99300699 0.93835616 1. ] mean value: 0.9896201136313041 key: test_recall value: [0.9375 0.9375 1. 1. 1. 0.875 0.86666667 0.8 0.9375 0.8125 ] mean value: 0.9166666666666666 key: train_recall value: [1. 0.99295775 0.99295775 1. 0.99295775 0.99295775 1. 0.99300699 0.96478873 1. ] mean value: 0.9929626711316852 key: test_roc_auc value: [0.875 0.875 0.96875 0.9375 0.96875 0.90625 0.87083333 0.9 0.86875 0.87291667] mean value: 0.904375 key: train_roc_auc value: [0.99647887 0.99295775 0.99295775 1. 0.99295775 0.99295775 1. 0.99298237 0.95092583 1. ] mean value: 0.9912218063626514 key: test_jcc value: [0.78947368 0.78947368 0.94117647 0.88888889 0.94117647 0.82352941 0.76470588 0.8 0.78947368 0.76470588] mean value: 0.8292604059167527 key: train_jcc value: [0.99300699 0.98601399 0.98601399 1. 0.98601399 0.98601399 1. 0.98611111 0.90728477 1. ] mean value: 0.9830458816385969 MCC on Blind test: 0.1 Accuracy on Blind test: 0.43 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01387405 0.00984526 0.01041555 0.0104239 0.00973701 0.00974059 0.01031542 0.00996995 0.00954819 0.01026106] mean value: 0.010413098335266113 key: score_time value: [0.00935388 0.0093286 0.00918818 0.00929189 0.00887823 0.0089097 0.00947213 0.00943756 0.00906706 0.00956702] mean value: 0.009249424934387207 key: test_mcc value: [0.57265629 0.44539933 0.67419986 0.31311215 0.37796447 0.69991324 0.6125 0.5612264 0.54812195 0.48527095] mean value: 0.5290364639771098 key: train_mcc value: [0.58060405 0.63028696 0.59100561 0.59378186 0.60524671 0.63786488 0.63536949 0.55859525 0.64298155 0.60682055] mean value: 0.6082556910309478 key: test_accuracy value: [0.78125 0.71875 0.8125 0.65625 0.6875 0.84375 0.80645161 0.77419355 0.77419355 0.74193548] mean value: 0.7596774193548387 key: train_accuracy value: [0.77464789 0.81338028 0.79225352 0.79225352 0.79929577 0.81690141 0.81754386 0.7754386 0.81754386 0.8 ] mean value: 0.7999258710155671 key: test_fscore value: [0.8 0.74285714 0.84210526 0.66666667 0.70588235 0.85714286 0.8 0.78787879 0.78787879 0.76470588] mean value: 0.7755117740876255 key: train_fscore value: [0.80606061 0.82274247 0.80655738 0.80906149 0.81311475 0.82666667 0.81560284 0.79354839 0.83006536 0.81311475] mean value: 0.8136534705016032 key: test_precision value: [0.73684211 0.68421053 0.72727273 0.64705882 0.66666667 0.78947368 0.8 0.72222222 0.76470588 0.72222222] mean value: 0.7260674860055665 key: train_precision value: [0.70744681 0.78343949 0.75460123 0.74850299 0.7607362 0.78481013 0.82733813 0.73652695 0.77439024 0.7607362 ] mean value: 0.763852835868928 key: test_recall value: [0.875 0.8125 1. 0.6875 0.75 0.9375 0.8 0.86666667 0.8125 0.8125 ] mean value: 0.8354166666666667 key: train_recall value: [0.93661972 0.86619718 0.86619718 0.88028169 0.87323944 0.87323944 0.8041958 0.86013986 0.8943662 0.87323944] mean value: 0.8727715946025805 key: test_roc_auc value: [0.78125 0.71875 0.8125 0.65625 0.6875 0.84375 0.80625 0.77708333 0.77291667 0.73958333] mean value: 0.7595833333333334 key: train_roc_auc value: [0.77464789 0.81338028 0.79225352 0.79225352 0.79929577 0.81690141 0.81759086 0.77514035 0.81781247 0.80025608] mean value: 0.7999532157982862 key: test_jcc value: [0.66666667 0.59090909 0.72727273 0.5 0.54545455 0.75 0.66666667 0.65 0.65 0.61904762] mean value: 0.6366017316017316 key: train_jcc value: [0.6751269 0.69886364 0.67582418 0.67934783 0.68508287 0.70454545 0.68862275 0.65775401 0.70949721 0.68508287] mean value: 0.6859747714119993 MCC on Blind test: 0.19 Accuracy on Blind test: 0.46 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.00973558 0.00966954 0.01009893 0.00950837 0.00984263 0.01027656 0.00975704 0.01026225 0.01067615 0.01004601] mean value: 0.009987306594848634 key: score_time value: [0.00911403 0.00956821 0.00901866 0.00881863 0.00951982 0.00912619 0.00956416 0.00889802 0.00927353 0.00916696] mean value: 0.009206819534301757 key: test_mcc value: [0.62994079 0.51639778 0.72374686 0.69991324 0.56360186 0.56360186 0.6125 0.6125 0.54812195 0.48333333] mean value: 0.5953657681593737 key: train_mcc value: [0.72009768 0.71945253 0.65494582 0.71270053 0.69351968 0.67848335 0.70693066 0.70025076 0.70041244 0.73046876] mean value: 0.7017262203984271 key: test_accuracy value: [0.8125 0.75 0.84375 0.84375 0.78125 0.78125 0.80645161 0.80645161 0.77419355 0.74193548] mean value: 0.7941532258064516 key: train_accuracy value: [0.85915493 0.85915493 0.82746479 0.8556338 0.84507042 0.83802817 0.85263158 0.84912281 0.84912281 0.86315789] mean value: 0.8498542129972819 key: test_fscore value: [0.82352941 0.77777778 0.86486486 0.85714286 0.77419355 0.77419355 0.8 0.8 0.78787879 0.75 ] mean value: 0.8009580796203187 key: train_fscore value: [0.86394558 0.8630137 0.82807018 0.86006826 0.85234899 0.84459459 0.85810811 0.85521886 0.85423729 0.86956522] mean value: 0.8549170768422738 key: test_precision value: [0.77777778 0.7 0.76190476 0.78947368 0.8 0.8 0.8 0.8 0.76470588 0.75 ] mean value: 0.7743862106246007 key: train_precision value: [0.83552632 0.84 0.82517483 0.83443709 0.81410256 0.81168831 0.83006536 0.82467532 0.82352941 0.82802548] mean value: 0.8267224676472051 key: test_recall value: [0.875 0.875 1. 0.9375 0.75 0.75 0.8 0.8 0.8125 0.75 ] mean value: 0.835 key: train_recall value: [0.8943662 0.88732394 0.83098592 0.88732394 0.8943662 0.88028169 0.88811189 0.88811189 0.88732394 0.91549296] mean value: 0.885368856495617 key: test_roc_auc value: [0.8125 0.75 0.84375 0.84375 0.78125 0.78125 0.80625 0.80625 0.77291667 0.74166667] mean value: 0.7939583333333333 key: train_roc_auc value: [0.85915493 0.85915493 0.82746479 0.8556338 0.84507042 0.83802817 0.85250665 0.84898552 0.84925638 0.86334088] mean value: 0.8498596473948588 key: test_jcc value: [0.7 0.63636364 0.76190476 0.75 0.63157895 0.63157895 0.66666667 0.66666667 0.65 0.6 ] mean value: 0.6694759626338573 key: train_jcc value: [0.76047904 0.75903614 0.70658683 0.75449102 0.74269006 0.73099415 0.75147929 0.74705882 0.74556213 0.76923077] mean value: 0.7467608254210698 MCC on Blind test: 0.25 Accuracy on Blind test: 0.54 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01051188 0.01002789 0.0098722 0.00999093 0.00889993 0.00897336 0.00993133 0.01016641 0.00993896 0.00988293] mean value: 0.009819579124450684 key: score_time value: [0.0114634 0.01155281 0.01165009 0.01547885 0.01105022 0.01161242 0.0118773 0.01145554 0.01150084 0.01167846] mean value: 0.011931991577148438 key: test_mcc value: [0.56360186 0.32897585 0.77459667 0.438357 0.438357 0.62994079 0.50443936 0.28870546 0.225 0.48954403] mean value: 0.46815180211506136 key: train_mcc value: [0.59207807 0.6479516 0.61342184 0.66916344 0.64445071 0.61452264 0.61517352 0.64330646 0.60128363 0.61483888] mean value: 0.6256190785735106 key: test_accuracy value: [0.78125 0.65625 0.875 0.71875 0.71875 0.8125 0.74193548 0.64516129 0.61290323 0.74193548] mean value: 0.7304435483870968 key: train_accuracy value: [0.79577465 0.82394366 0.80633803 0.83450704 0.82042254 0.80633803 0.80701754 0.82105263 0.8 0.80701754] mean value: 0.8122411662960217 key: test_fscore value: [0.78787879 0.7027027 0.88888889 0.72727273 0.72727273 0.82352941 0.76470588 0.62068966 0.625 0.73333333] mean value: 0.7401274116639228 key: train_fscore value: [0.8 0.82517483 0.81099656 0.83623693 0.82943144 0.81355932 0.81355932 0.82711864 0.80546075 0.81099656] mean value: 0.8172534363236427 key: test_precision value: [0.76470588 0.61904762 0.8 0.70588235 0.70588235 0.77777778 0.68421053 0.64285714 0.625 0.78571429] mean value: 0.711107793994791 key: train_precision value: [0.78378378 0.81944444 0.79194631 0.82758621 0.78980892 0.78431373 0.78947368 0.80263158 0.78145695 0.79194631] mean value: 0.7962391912062372 key: test_recall value: [0.8125 0.8125 1. 0.75 0.75 0.875 0.86666667 0.6 0.625 0.6875 ] mean value: 0.7779166666666667 key: train_recall value: [0.81690141 0.83098592 0.83098592 0.84507042 0.87323944 0.84507042 0.83916084 0.85314685 0.83098592 0.83098592] mean value: 0.8396533044420368 key: test_roc_auc value: [0.78125 0.65625 0.875 0.71875 0.71875 0.8125 0.74583333 0.64375 0.6125 0.74375 ] mean value: 0.7308333333333333 key: train_roc_auc value: [0.79577465 0.82394366 0.80633803 0.83450704 0.82042254 0.80633803 0.80690436 0.82093962 0.80010834 0.80710135] mean value: 0.8122377622377622 key: test_jcc value: [0.65 0.54166667 0.8 0.57142857 0.57142857 0.7 0.61904762 0.45 0.45454545 0.57894737] mean value: 0.5937064251537936 key: train_jcc value: [0.66666667 0.70238095 0.68208092 0.71856287 0.70857143 0.68571429 0.68571429 0.70520231 0.67428571 0.68208092] mean value: 0.6911260369434541 MCC on Blind test: 0.15 Accuracy on Blind test: 0.54 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.0170126 0.0137372 0.01393175 0.01377177 0.01376629 0.01400828 0.01391697 0.01411223 0.01390743 0.01402569] mean value: 0.014219021797180176 key: score_time value: [0.01044726 0.01004076 0.00990629 0.00992775 0.01005983 0.01006055 0.01024079 0.00993419 0.00997138 0.01006269] mean value: 0.010065150260925294 key: test_mcc value: [0.75592895 0.64549722 0.8819171 0.64549722 0.625 0.69991324 0.61925228 0.87083333 0.55573827 0.74689528] mean value: 0.704647291079871 key: train_mcc value: [0.8015394 0.8015394 0.8015394 0.82090085 0.79514657 0.79667392 0.80213695 0.78160256 0.82281252 0.78329205] mean value: 0.8007183608777234 key: test_accuracy value: [0.875 0.8125 0.9375 0.8125 0.8125 0.84375 0.80645161 0.93548387 0.77419355 0.87096774] mean value: 0.8480846774193548 key: train_accuracy value: [0.89788732 0.89788732 0.89788732 0.9084507 0.8943662 0.8943662 0.89824561 0.8877193 0.90877193 0.8877193 ] mean value: 0.8973301210773412 key: test_fscore value: [0.88235294 0.83333333 0.94117647 0.83333333 0.8125 0.85714286 0.8125 0.93333333 0.8 0.88235294] mean value: 0.8588025210084034 key: train_fscore value: [0.90365449 0.90365449 0.90365449 0.91275168 0.90066225 0.90131579 0.90429043 0.89473684 0.91333333 0.89473684] mean value: 0.9032790620717928 key: test_precision value: [0.83333333 0.75 0.88888889 0.75 0.8125 0.78947368 0.76470588 0.93333333 0.73684211 0.83333333] mean value: 0.8092410560715514 key: train_precision value: [0.85534591 0.85534591 0.85534591 0.87179487 0.85 0.84567901 0.85625 0.8447205 0.86708861 0.83950617] mean value: 0.854107689731846 key: test_recall value: [0.9375 0.9375 1. 0.9375 0.8125 0.9375 0.86666667 0.93333333 0.875 0.9375 ] mean value: 0.9175 key: train_recall value: [0.95774648 0.95774648 0.95774648 0.95774648 0.95774648 0.96478873 0.95804196 0.95104895 0.96478873 0.95774648] mean value: 0.9585147247119078 key: test_roc_auc value: [0.875 0.8125 0.9375 0.8125 0.8125 0.84375 0.80833333 0.93541667 0.77083333 0.86875 ] mean value: 0.8477083333333334 key: train_roc_auc value: [0.89788732 0.89788732 0.89788732 0.9084507 0.8943662 0.8943662 0.89803506 0.88749631 0.90896779 0.88796415] mean value: 0.8973308381759086 key: test_jcc value: [0.78947368 0.71428571 0.88888889 0.71428571 0.68421053 0.75 0.68421053 0.875 0.66666667 0.78947368] mean value: 0.7556495405179615 key: train_jcc value: [0.82424242 0.82424242 0.82424242 0.83950617 0.81927711 0.82035928 0.8253012 0.80952381 0.8404908 0.80952381] mean value: 0.8236709456850548 MCC on Blind test: 0.26 Accuracy on Blind test: 0.53 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.14150548 1.34942889 1.16798019 1.32275343 1.13871098 1.29670668 1.15652013 1.27996182 1.19484806 1.22055531] mean value: 1.2268970966339112 key: score_time value: [0.01491427 0.01574922 0.01575661 0.01515722 0.01519918 0.015342 0.01536441 0.01537752 0.01538777 0.01550269] mean value: 0.015375089645385743 key: test_mcc value: [0.625 0.75592895 0.875 0.69991324 0.875 0.75 0.80833333 0.74689528 0.61608311 0.6778302 ] mean value: 0.742998411791738 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.8125 0.875 0.9375 0.84375 0.9375 0.875 0.90322581 0.87096774 0.80645161 0.83870968] mean value: 0.8700604838709678 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8125 0.88235294 0.9375 0.85714286 0.9375 0.875 0.90322581 0.85714286 0.82352941 0.84848485] mean value: 0.8734378722163352 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.8125 0.83333333 0.9375 0.78947368 0.9375 0.875 0.875 0.92307692 0.77777778 0.82352941] mean value: 0.8584691130163267 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8125 0.9375 0.9375 0.9375 0.9375 0.875 0.93333333 0.8 0.875 0.875 ] mean value: 0.8920833333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.8125 0.875 0.9375 0.84375 0.9375 0.875 0.90416667 0.86875 0.80416667 0.8375 ] mean value: 0.8695833333333334 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.68421053 0.78947368 0.88235294 0.75 0.88235294 0.77777778 0.82352941 0.75 0.7 0.73684211] mean value: 0.7776539387684899 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.22 Accuracy on Blind test: 0.55 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02083063 0.01830387 0.01479101 0.01522541 0.01517844 0.01537442 0.01601005 0.01530504 0.01531887 0.01610589] mean value: 0.01624436378479004 key: score_time value: [0.01181269 0.0092442 0.00874877 0.00870419 0.0087533 0.00873685 0.00871205 0.0087297 0.00876212 0.00880837] mean value: 0.009101223945617676 key: test_mcc value: [0.62994079 0.93933644 0.8819171 1. 1. 0.82717019 0.80753845 1. 0.74166667 0.74166667] mean value: 0.856923630331575 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.8125 0.96875 0.9375 1. 1. 0.90625 0.90322581 1. 0.87096774 0.87096774] mean value: 0.927016129032258 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8 0.96969697 0.94117647 1. 1. 0.89655172 0.89655172 1. 0.875 0.875 ] mean value: 0.9253976888561067 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85714286 0.94117647 0.88888889 1. 1. 1. 0.92857143 1. 0.875 0.875 ] mean value: 0.936577964519141 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.75 1. 1. 1. 1. 0.8125 0.86666667 1. 0.875 0.875 ] mean value: 0.9179166666666667 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.8125 0.96875 0.9375 1. 1. 0.90625 0.90208333 1. 0.87083333 0.87083333] mean value: 0.926875 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.66666667 0.94117647 0.88888889 1. 1. 0.8125 0.8125 1. 0.77777778 0.77777778] mean value: 0.8677287581699347 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.03 Accuracy on Blind test: 0.2 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.10153913 0.10321045 0.10332656 0.10254884 0.10229015 0.10297704 0.10236096 0.10254765 0.10236812 0.10320854] mean value: 0.10263774394989014 key: score_time value: [0.01733112 0.0174036 0.0175724 0.0176034 0.01755619 0.01750135 0.01753855 0.01772189 0.01778698 0.01753831] mean value: 0.017555379867553712 key: test_mcc value: [0.625 0.69991324 0.8819171 0.68884672 0.62994079 0.81409158 0.6778302 0.87083333 0.61608311 0.74166667] mean value: 0.7246122745982656 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.8125 0.84375 0.9375 0.84375 0.8125 0.90625 0.83870968 0.93548387 0.80645161 0.87096774] mean value: 0.8607862903225807 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.8125 0.85714286 0.94117647 0.84848485 0.8 0.90909091 0.82758621 0.93333333 0.82352941 0.875 ] mean value: 0.8627844037301441 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.8125 0.78947368 0.88888889 0.82352941 0.85714286 0.88235294 0.85714286 0.93333333 0.77777778 0.875 ] mean value: 0.8497141751437417 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8125 0.9375 1. 0.875 0.75 0.9375 0.8 0.93333333 0.875 0.875 ] mean value: 0.8795833333333334 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.8125 0.84375 0.9375 0.84375 0.8125 0.90625 0.8375 0.93541667 0.80416667 0.87083333] mean value: 0.8604166666666667 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.68421053 0.75 0.88888889 0.73684211 0.66666667 0.83333333 0.70588235 0.875 0.7 0.77777778] mean value: 0.761860165118679 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.2 Accuracy on Blind test: 0.54 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00992775 0.00947857 0.0094595 0.00942636 0.00943565 0.00957608 0.00951862 0.00950575 0.00959396 0.00936556] mean value: 0.009528779983520507 key: score_time value: [0.00903702 0.00869894 0.00865579 0.00867105 0.00870419 0.00873446 0.0087626 0.00865459 0.00866556 0.00874281] mean value: 0.00873270034790039 key: test_mcc value: [0.38729833 0.31311215 0.57265629 0.37796447 0.19088543 0.56360186 0.29844172 0.4184137 0.61925228 0.6125 ] mean value: 0.4354126237712749 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.6875 0.65625 0.78125 0.6875 0.59375 0.78125 0.64516129 0.70967742 0.80645161 0.80645161] mean value: 0.7155241935483871 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.64285714 0.64516129 0.8 0.66666667 0.55172414 0.78787879 0.56 0.68965517 0.8 0.8125 ] mean value: 0.6956443198070006 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 0.66666667 0.73684211 0.71428571 0.61538462 0.76470588 0.7 0.71428571 0.85714286 0.8125 ] mean value: 0.7331813555381667 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.5625 0.625 0.875 0.625 0.5 0.8125 0.46666667 0.66666667 0.75 0.8125 ] mean value: 0.6695833333333333 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.6875 0.65625 0.78125 0.6875 0.59375 0.78125 0.63958333 0.70833333 0.80833333 0.80625 ] mean value: 0.7150000000000001 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.47368421 0.47619048 0.66666667 0.5 0.38095238 0.65 0.38888889 0.52631579 0.66666667 0.68421053] mean value: 0.5413575605680869 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.13 Accuracy on Blind test: 0.56 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.41357183 1.40730119 1.40797234 1.41579556 1.43474579 1.41381311 1.42277074 1.42039752 1.40721416 1.4074707 ] mean value: 1.4151052951812744 key: score_time value: [0.09626412 0.0903163 0.09786963 0.09362054 0.0960052 0.0972116 0.09172726 0.09013462 0.14322925 0.09644318] mean value: 0.09928216934204101 key: test_mcc value: [0.75592895 0.8819171 0.93933644 0.81409158 0.81409158 0.93933644 0.87083333 0.9372467 0.87083333 0.80753845] mean value: 0.8631153893705837 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.875 0.9375 0.96875 0.90625 0.90625 0.96875 0.93548387 0.96774194 0.93548387 0.90322581] mean value: 0.9304435483870968 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88235294 0.94117647 0.96969697 0.90909091 0.90909091 0.96969697 0.93333333 0.96551724 0.9375 0.90909091] mean value: 0.9326546653144017 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.83333333 0.88888889 0.94117647 0.88235294 0.88235294 0.94117647 0.93333333 1. 0.9375 0.88235294] mean value: 0.9122467320261438 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9375 1. 1. 0.9375 0.9375 1. 0.93333333 0.93333333 0.9375 0.9375 ] mean value: 0.9554166666666667 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.875 0.9375 0.96875 0.90625 0.90625 0.96875 0.93541667 0.96666667 0.93541667 0.90208333] mean value: 0.9302083333333333 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( mean value: 1.0 key: test_jcc value: [0.78947368 0.88888889 0.94117647 0.83333333 0.83333333 0.94117647 0.875 0.93333333 0.88235294 0.83333333] mean value: 0.875140178878569 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.07 Accuracy on Blind test: 0.31 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000...05', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.94509768 0.92780328 0.92000961 0.8821938 0.92545223 0.95332265 0.93400884 0.98241067 0.93518734 0.926929 ] mean value: 0.9332415103912354 key: score_time value: [0.22051883 0.21201563 0.21300459 0.25284195 0.23016191 0.20851731 0.26344585 0.20474267 0.2121985 0.26367092] mean value: 0.2281118154525757 key: test_mcc value: [0.75592895 0.82717019 0.81409158 0.81409158 0.8819171 0.8819171 0.87083333 0.9372467 0.67916667 0.80753845] mean value: 0.826990164934043 key: train_mcc value: [0.97221679 0.95129413 0.95812669 0.94450549 0.94403659 0.95129413 0.95826776 0.95145657 0.95146839 0.9582759 ] mean value: 0.954094243100845 key: test_accuracy value: [0.875 0.90625 0.90625 0.90625 0.9375 0.9375 0.93548387 0.96774194 0.83870968 0.90322581] mean value: 0.911391129032258 key: train_accuracy value: [0.98591549 0.97535211 0.97887324 0.97183099 0.97183099 0.97535211 0.97894737 0.9754386 0.9754386 0.97894737] mean value: 0.9767926859402026 key: test_fscore value: [0.88235294 0.91428571 0.90909091 0.90909091 0.94117647 0.94117647 0.93333333 0.96551724 0.83870968 0.90909091] mean value: 0.9143824576043381 key: train_fscore value: [0.98611111 0.97577855 0.97916667 0.97241379 0.97222222 0.97577855 0.97931034 0.97594502 0.97577855 0.97916667] mean value: 0.977167146191824 key: test_precision value: [0.83333333 0.84210526 0.88235294 0.88235294 0.88888889 0.88888889 0.93333333 1. 0.86666667 0.88235294] mean value: 0.8900275197798417 key: train_precision value: [0.97260274 0.95918367 0.96575342 0.9527027 0.95890411 0.95918367 0.96598639 0.95945946 0.95918367 0.96575342] mean value: 0.9618713275758285 key: test_recall value: [0.9375 1. 0.9375 0.9375 1. 1. 0.93333333 0.93333333 0.8125 0.9375 ] mean value: 0.9429166666666666 key: train_recall value: [1. 0.99295775 0.99295775 0.99295775 0.98591549 0.99295775 0.99300699 0.99300699 0.99295775 0.99295775] mean value: 0.9929675957844972 key: test_roc_auc value: [0.875 0.90625 0.90625 0.90625 0.9375 0.9375 0.93541667 0.96666667 0.83958333 0.90208333] mean value: 0.91125 key: train_roc_auc value: [0.98591549 0.97535211 0.97887324 0.97183099 0.97183099 0.97535211 0.97889786 0.97537674 0.97549985 0.97899636] mean value: 0.9767925736235595 key: test_jcc value: [0.78947368 0.84210526 0.83333333 0.83333333 0.88888889 0.88888889 0.875 0.93333333 0.72222222 0.83333333] mean value: 0.8439912280701755 key: train_jcc value: [0.97260274 0.9527027 0.95918367 0.94630872 0.94594595 0.9527027 0.95945946 0.95302013 0.9527027 0.95918367] mean value: 0.9553812459238719 MCC on Blind test: 0.12 Accuracy on Blind test: 0.35 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02524734 0.01086307 0.01085806 0.00962305 0.00966716 0.00964165 0.01023078 0.00999737 0.01087999 0.01071978] mean value: 0.011772823333740235 key: score_time value: [0.01101589 0.00973225 0.00983381 0.00891066 0.00889468 0.00895429 0.00897837 0.00976849 0.00972676 0.00902939] mean value: 0.009484457969665527 key: test_mcc value: [0.62994079 0.51639778 0.72374686 0.69991324 0.56360186 0.56360186 0.6125 0.6125 0.54812195 0.48333333] mean value: 0.5953657681593737 key: train_mcc value: [0.72009768 0.71945253 0.65494582 0.71270053 0.69351968 0.67848335 0.70693066 0.70025076 0.70041244 0.73046876] mean value: 0.7017262203984271 key: test_accuracy value: [0.8125 0.75 0.84375 0.84375 0.78125 0.78125 0.80645161 0.80645161 0.77419355 0.74193548] mean value: 0.7941532258064516 key: train_accuracy value: [0.85915493 0.85915493 0.82746479 0.8556338 0.84507042 0.83802817 0.85263158 0.84912281 0.84912281 0.86315789] mean value: 0.8498542129972819 key: test_fscore value: [0.82352941 0.77777778 0.86486486 0.85714286 0.77419355 0.77419355 0.8 0.8 0.78787879 0.75 ] mean value: 0.8009580796203187 key: train_fscore value: [0.86394558 0.8630137 0.82807018 0.86006826 0.85234899 0.84459459 0.85810811 0.85521886 0.85423729 0.86956522] mean value: 0.8549170768422738 key: test_precision value: [0.77777778 0.7 0.76190476 0.78947368 0.8 0.8 0.8 0.8 0.76470588 0.75 ] mean value: 0.7743862106246007 key: train_precision value: [0.83552632 0.84 0.82517483 0.83443709 0.81410256 0.81168831 0.83006536 0.82467532 0.82352941 0.82802548] mean value: 0.8267224676472051 key: test_recall value: [0.875 0.875 1. 0.9375 0.75 0.75 0.8 0.8 0.8125 0.75 ] mean value: 0.835 key: train_recall value: [0.8943662 0.88732394 0.83098592 0.88732394 0.8943662 0.88028169 0.88811189 0.88811189 0.88732394 0.91549296] mean value: 0.885368856495617 key: test_roc_auc value: [0.8125 0.75 0.84375 0.84375 0.78125 0.78125 0.80625 0.80625 0.77291667 0.74166667] mean value: 0.7939583333333333 key: train_roc_auc value: [0.85915493 0.85915493 0.82746479 0.8556338 0.84507042 0.83802817 0.85250665 0.84898552 0.84925638 0.86334088] mean value: 0.8498596473948588 key: test_jcc value: [0.7 0.63636364 0.76190476 0.75 0.63157895 0.63157895 0.66666667 0.66666667 0.65 0.6 ] mean value: 0.6694759626338573 key: train_jcc value: [0.76047904 0.75903614 0.70658683 0.75449102 0.74269006 0.73099415 0.75147929 0.74705882 0.74556213 0.76923077] mean value: 0.7467608254210698 MCC on Blind test: 0.25 Accuracy on Blind test: 0.54 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.08318973 0.05189371 0.05080152 0.05236053 0.0516212 0.0818789 0.05003667 0.05707097 0.05641842 0.13113332] mean value: 0.06664049625396729 key: score_time value: [0.01104307 0.01070857 0.01106858 0.01062179 0.01056743 0.01061416 0.01038623 0.01031113 0.01025176 0.01268387] mean value: 0.010825657844543457 key: test_mcc value: [0.81409158 0.93933644 0.93933644 0.93933644 0.875 1. 0.87083333 1. 0.9372467 0.87083333] mean value: 0.9186014252766943 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.90625 0.96875 0.96875 0.96875 0.9375 1. 0.93548387 1. 0.96774194 0.93548387] mean value: 0.9588709677419355 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90909091 0.96969697 0.96969697 0.96969697 0.9375 1. 0.93333333 1. 0.96969697 0.9375 ] mean value: 0.9596212121212121 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.88235294 0.94117647 0.94117647 0.94117647 0.9375 1. 0.93333333 1. 0.94117647 0.9375 ] mean value: 0.9455392156862745 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9375 1. 1. 1. 0.9375 1. 0.93333333 1. 1. 0.9375 ] mean value: 0.9745833333333334 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90625 0.96875 0.96875 0.96875 0.9375 1. 0.93541667 1. 0.96666667 0.93541667] mean value: 0.95875 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.83333333 0.94117647 0.94117647 0.94117647 0.88235294 1. 0.875 1. 0.94117647 0.88235294] mean value: 0.9237745098039216 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.03 Accuracy on Blind test: 0.2 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.03818059 0.06525922 0.06525731 0.06771278 0.06377387 0.06696844 0.06272554 0.05751729 0.05288243 0.0317173 ] mean value: 0.05719947814941406 key: score_time value: [0.0220902 0.01350856 0.02089882 0.01204014 0.02447152 0.02816701 0.0265007 0.02375865 0.01212239 0.02076554] mean value: 0.020432353019714355 key: test_mcc value: [0.81409158 0.81409158 0.93933644 0.72374686 0.93933644 0.93933644 0.67916667 0.80753845 0.80753845 0.74896053] mean value: 0.8213143427561892 key: train_mcc value: [0.9860133 0.97192739 0.97183099 0.97192739 0.97889751 0.97192739 0.97202385 0.96512319 0.9720266 0.9720266 ] mean value: 0.973372421059164 key: test_accuracy value: [0.90625 0.90625 0.96875 0.84375 0.96875 0.96875 0.83870968 0.90322581 0.90322581 0.87096774] mean value: 0.9078629032258064 key: train_accuracy value: [0.99295775 0.98591549 0.98591549 0.98591549 0.98943662 0.98591549 0.98596491 0.98245614 0.98596491 0.98596491] mean value: 0.9866407215221151 key: test_fscore value: [0.90909091 0.90909091 0.96969697 0.86486486 0.96969697 0.96969697 0.83870968 0.89655172 0.90909091 0.86666667] mean value: 0.9103156569452454 key: train_fscore value: [0.99300699 0.98601399 0.98591549 0.98601399 0.98947368 0.98601399 0.98611111 0.98269896 0.98601399 0.98601399] mean value: 0.9867276173294023 key: test_precision value: [0.88235294 0.88235294 0.94117647 0.76190476 0.94117647 0.94117647 0.8125 0.92857143 0.88235294 0.92857143] mean value: 0.8902135854341736 key: train_precision value: [0.98611111 0.97916667 0.98591549 0.97916667 0.98601399 0.97916667 0.97931034 0.97260274 0.97916667 0.97916667] mean value: 0.9805787007969791 key: test_recall value: [0.9375 0.9375 1. 1. 1. 1. 0.86666667 0.86666667 0.9375 0.8125 ] mean value: 0.9358333333333333 key: train_recall value: [1. 0.99295775 0.98591549 0.99295775 0.99295775 0.99295775 0.99300699 0.99300699 0.99295775 0.99295775] mean value: 0.9929675957844972 key: test_roc_auc value: [0.90625 0.90625 0.96875 0.84375 0.96875 0.96875 0.83958333 0.90208333 0.90208333 0.87291667] mean value: 0.9079166666666667 key: train_roc_auc value: [0.99295775 0.98591549 0.98591549 0.98591549 0.98943662 0.98591549 0.98594012 0.98241899 0.98598936 0.98598936] mean value: 0.986639416921107 key: test_jcc value: [0.83333333 0.83333333 0.94117647 0.76190476 0.94117647 0.94117647 0.72222222 0.8125 0.83333333 0.76470588] mean value: 0.8384862278244631 key: train_jcc value: [0.98611111 0.97241379 0.97222222 0.97241379 0.97916667 0.97241379 0.97260274 0.96598639 0.97241379 0.97241379] mean value: 0.9738158099801092 MCC on Blind test: 0.16 Accuracy on Blind test: 0.47 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01269746 0.01149964 0.01052713 0.01056075 0.0101912 0.01011467 0.01012588 0.01005483 0.01007462 0.01022625] mean value: 0.010607242584228516 key: score_time value: [0.01169109 0.00982738 0.00970292 0.00943661 0.00925064 0.00929761 0.00929666 0.00924301 0.00927711 0.00933456] mean value: 0.00963575839996338 key: test_mcc value: [0.75 0.51639778 0.77459667 0.62994079 0.75592895 0.625 0.48333333 0.61925228 0.6310315 0.61925228] mean value: 0.640473358423418 key: train_mcc value: [0.7275383 0.69185856 0.66256355 0.70767315 0.68515743 0.70223363 0.70631586 0.63559314 0.70986095 0.66750412] mean value: 0.6896298694982799 key: test_accuracy value: [0.875 0.75 0.875 0.8125 0.875 0.8125 0.74193548 0.80645161 0.80645161 0.80645161] mean value: 0.8161290322580645 key: train_accuracy value: [0.86267606 0.84507042 0.83098592 0.85211268 0.8415493 0.84859155 0.85263158 0.81754386 0.85263158 0.83157895] mean value: 0.8435371880405238 key: test_fscore value: [0.875 0.77777778 0.88888889 0.82352941 0.88235294 0.8125 0.73333333 0.8125 0.83333333 0.8 ] mean value: 0.823921568627451 key: train_fscore value: [0.86779661 0.85034014 0.83448276 0.8590604 0.84745763 0.85714286 0.85714286 0.82191781 0.86 0.84 ] mean value: 0.8495341057152703 key: test_precision value: [0.875 0.7 0.8 0.77777778 0.83333333 0.8125 0.73333333 0.76470588 0.75 0.85714286] mean value: 0.7903793183940243 key: train_precision value: [0.83660131 0.82236842 0.81756757 0.82051282 0.81699346 0.81132075 0.83443709 0.80536913 0.8164557 0.79746835] mean value: 0.8179094599334236 key: test_recall value: [0.875 0.875 1. 0.875 0.9375 0.8125 0.73333333 0.86666667 0.9375 0.75 ] mean value: 0.86625 key: train_recall value: [0.90140845 0.88028169 0.85211268 0.90140845 0.88028169 0.9084507 0.88111888 0.83916084 0.9084507 0.88732394] mean value: 0.8839998030138876 key: test_roc_auc value: [0.875 0.75 0.875 0.8125 0.875 0.8125 0.74166667 0.80833333 0.80208333 0.80833333] mean value: 0.8160416666666667 key: train_roc_auc value: [0.86267606 0.84507042 0.83098592 0.85211268 0.8415493 0.84859155 0.85253127 0.81746774 0.85282675 0.83177386] mean value: 0.8435585541219344 key: test_jcc value: [0.77777778 0.63636364 0.8 0.7 0.78947368 0.68421053 0.57894737 0.68421053 0.71428571 0.66666667] mean value: 0.7031935900356953 key: train_jcc value: [0.76646707 0.73964497 0.71597633 0.75294118 0.73529412 0.75 0.75 0.69767442 0.75438596 0.72413793] mean value: 0.7386521976312473 MCC on Blind test: 0.23 Accuracy on Blind test: 0.57 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01371336 0.01587415 0.01515174 0.02035451 0.01662254 0.02007604 0.0158751 0.021281 0.01977491 0.01828241] mean value: 0.017700576782226564 key: score_time value: [0.00933337 0.01098824 0.0109973 0.01164103 0.01162219 0.01165724 0.01247501 0.01168752 0.01164079 0.01164222] mean value: 0.011368489265441895 key: test_mcc value: [0.75592895 0.75592895 0.53935989 0.82717019 0.875 0.68884672 0.74896053 0.60910959 0.71269665 0.6778302 ] mean value: 0.7190831661199489 key: train_mcc value: [0.94450549 0.89545487 0.68840989 0.93105621 0.92274116 0.97183099 0.86506676 0.9115139 0.90499493 0.93704438] mean value: 0.8972618583980994 key: test_accuracy value: [0.875 0.875 0.75 0.90625 0.9375 0.84375 0.87096774 0.77419355 0.83870968 0.83870968] mean value: 0.8510080645161291 key: train_accuracy value: [0.97183099 0.9471831 0.82394366 0.96478873 0.96126761 0.98591549 0.92982456 0.95438596 0.95087719 0.96842105] mean value: 0.9458438349394613 key: test_fscore value: [0.88235294 0.88235294 0.69230769 0.91428571 0.9375 0.83870968 0.875 0.69565217 0.86486486 0.84848485] mean value: 0.8431510853628459 key: train_fscore value: [0.97241379 0.94845361 0.78813559 0.96575342 0.96167247 0.98591549 0.93377483 0.95272727 0.9527027 0.96797153] mean value: 0.9429520726170258 key: test_precision value: [0.83333333 0.83333333 0.9 0.84210526 0.9375 0.86666667 0.82352941 1. 0.76190476 0.82352941] mean value: 0.8621902181925402 key: train_precision value: [0.9527027 0.9261745 0.9893617 0.94 0.95172414 0.98591549 0.88679245 0.99242424 0.91558442 0.97841727] mean value: 0.9519096909389335 key: test_recall value: [0.9375 0.9375 0.5625 1. 0.9375 0.8125 0.93333333 0.53333333 1. 0.875 ] mean value: 0.8529166666666667 key: train_recall value: [0.99295775 0.97183099 0.65492958 0.99295775 0.97183099 0.98591549 0.98601399 0.91608392 0.99295775 0.95774648] mean value: 0.9423224662661283 key: test_roc_auc value: [0.875 0.875 0.75 0.90625 0.9375 0.84375 0.87291667 0.76666667 0.83333333 0.8375 ] mean value: 0.8497916666666667 key: train_roc_auc value: [0.97183099 0.9471831 0.82394366 0.96478873 0.96126761 0.98591549 0.92962671 0.95452083 0.95102433 0.96838373] mean value: 0.9458485176795036 key: test_jcc value: [0.78947368 0.78947368 0.52941176 0.84210526 0.88235294 0.72222222 0.77777778 0.53333333 0.76190476 0.73684211] mean value: 0.7364897537962554 key: train_jcc value: [0.94630872 0.90196078 0.65034965 0.93377483 0.9261745 0.97222222 0.8757764 0.90972222 0.90967742 0.93793103] mean value: 0.8963897786374542 MCC on Blind test: 0.22 Accuracy on Blind test: 0.44 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01716423 0.01853156 0.01632261 0.01627469 0.01817465 0.01648641 0.01762009 0.0174396 0.01730609 0.01503563] mean value: 0.017035555839538575 key: score_time value: [0.01252532 0.01231885 0.01208925 0.01201677 0.01193023 0.01193023 0.01202798 0.01192021 0.01187968 0.01195002] mean value: 0.012058854103088379 key: test_mcc value: [0.75592895 0.62994079 0.67419986 0.67419986 0.75592895 0.67419986 0.82285074 0.9375 0.74689528 0.55573827] mean value: 0.7227382560801155 key: train_mcc value: [0.89939824 0.83774371 0.74290818 0.8145351 0.81662226 0.8535792 0.88699028 0.93127922 0.92393444 0.79590827] mean value: 0.8502898903501691 key: test_accuracy value: [0.875 0.8125 0.8125 0.8125 0.875 0.8125 0.90322581 0.96774194 0.87096774 0.77419355] mean value: 0.8516129032258064 key: train_accuracy value: [0.9471831 0.91549296 0.8556338 0.90140845 0.90140845 0.92253521 0.94035088 0.96491228 0.96140351 0.8877193 ] mean value: 0.9198047936743267 key: test_fscore value: [0.88235294 0.8 0.84210526 0.76923077 0.86666667 0.76923077 0.90909091 0.96774194 0.88235294 0.8 ] mean value: 0.8488772195213821 key: train_fscore value: [0.94983278 0.90977444 0.87384615 0.89230769 0.89147287 0.91666667 0.94389439 0.96598639 0.96219931 0.89873418] mean value: 0.9204714866974258 key: test_precision value: [0.83333333 0.85714286 0.72727273 1. 0.92857143 1. 0.83333333 0.9375 0.83333333 0.73684211] mean value: 0.8687329118250171 key: train_precision value: [0.9044586 0.97580645 0.77595628 0.98305085 0.99137931 0.99180328 0.89375 0.94039735 0.93959732 0.81609195] mean value: 0.9212291391435611 key: test_recall value: [0.9375 0.75 1. 0.625 0.8125 0.625 1. 1. 0.9375 0.875 ] mean value: 0.85625 key: train_recall value: [1. 0.85211268 1. 0.81690141 0.80985915 0.85211268 1. 0.99300699 0.98591549 1. ] mean value: 0.9309908401457697 key: test_roc_auc value: [0.875 0.8125 0.8125 0.8125 0.875 0.8125 0.90625 0.96875 0.86875 0.77083333] mean value: 0.8514583333333333 key: train_roc_auc value: [0.9471831 0.91549296 0.8556338 0.90140845 0.90140845 0.92253521 0.94014085 0.96481336 0.96148922 0.88811189] mean value: 0.9198217275682065 key: test_jcc value: [0.78947368 0.66666667 0.72727273 0.625 0.76470588 0.625 0.83333333 0.9375 0.78947368 0.66666667] mean value: 0.7425092644713388 key: train_jcc value: [0.9044586 0.83448276 0.77595628 0.80555556 0.8041958 0.84615385 0.89375 0.93421053 0.92715232 0.81609195] mean value: 0.8542007645624589 MCC on Blind test: 0.2 Accuracy on Blind test: 0.48 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.14941049 0.13107944 0.13037682 0.13134885 0.13136578 0.13163543 0.1313796 0.13201165 0.13207245 0.13123918] mean value: 0.13319196701049804 key: score_time value: [0.01486826 0.01501536 0.01490831 0.01498747 0.01508451 0.01519775 0.01501441 0.01499581 0.01492596 0.01497769] mean value: 0.014997553825378419 key: test_mcc value: [0.81409158 0.81409158 0.93933644 0.93933644 0.8819171 1. 0.87083333 0.9372467 1. 0.87083333] mean value: 0.906768649823811 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.90625 0.90625 0.96875 0.96875 0.9375 1. 0.93548387 0.96774194 1. 0.93548387] mean value: 0.9526209677419355 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90909091 0.90909091 0.96969697 0.96969697 0.94117647 1. 0.93333333 0.96551724 1. 0.9375 ] mean value: 0.9535102802876636 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.88235294 0.88235294 0.94117647 0.94117647 0.88888889 1. 0.93333333 1. 1. 0.9375 ] mean value: 0.9406781045751634 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9375 0.9375 1. 1. 1. 1. 0.93333333 0.93333333 1. 0.9375 ] mean value: 0.9679166666666666 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90625 0.90625 0.96875 0.96875 0.9375 1. 0.93541667 0.96666667 1. 0.93541667] mean value: 0.9525 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.83333333 0.83333333 0.94117647 0.94117647 0.88888889 1. 0.875 0.93333333 1. 0.88235294] mean value: 0.912859477124183 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.06 Accuracy on Blind test: 0.21 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.0401566 0.03634596 0.04769921 0.05428123 0.03971624 0.05849838 0.05273271 0.05906415 0.04415035 0.04030848] mean value: 0.047295331954956055 key: score_time value: [0.01692843 0.02299666 0.0224371 0.03471875 0.0253973 0.01851916 0.02662086 0.0234468 0.02398276 0.0300138 ] mean value: 0.024506163597106934 key: test_mcc value: [0.81409158 0.93933644 0.93933644 1. 0.93933644 0.93933644 0.80753845 1. 0.9375 0.80833333] mean value: 0.9124809107704197 key: train_mcc value: [1. 0.99298237 0.9860133 0.9860133 1. 0.98591549 1. 0.98596474 0.9791626 0.98596474] mean value: 0.9902016540079617 key: test_accuracy value: [0.90625 0.96875 0.96875 1. 0.96875 0.96875 0.90322581 1. 0.96774194 0.90322581] mean value: 0.9555443548387097 key: train_accuracy value: [1. 0.99647887 0.99295775 0.99295775 1. 0.99295775 1. 0.99298246 0.98947368 0.99298246] mean value: 0.9950790709167284 key: test_fscore value: [0.90909091 0.96969697 0.96969697 1. 0.96774194 0.96774194 0.89655172 1. 0.96774194 0.90322581] mean value: 0.9551488185526006 key: train_fscore value: [1. 0.99646643 0.9929078 0.9929078 1. 0.99295775 1. 0.99300699 0.98932384 0.99295775] mean value: 0.9950528363313396 key: test_precision value: [0.88235294 0.94117647 0.94117647 1. 1. 1. 0.92857143 1. 1. 0.93333333] mean value: 0.9626610644257703 key: train_precision value: [1. 1. 1. 1. 1. 0.99295775 1. 0.99300699 1. 0.99295775] mean value: 0.997892248596474 key: test_recall value: [0.9375 1. 1. 1. 0.9375 0.9375 0.86666667 1. 0.9375 0.875 ] mean value: 0.9491666666666667 key: train_recall value: [1. 0.99295775 0.98591549 0.98591549 1. 0.99295775 1. 0.99300699 0.97887324 0.99295775] mean value: 0.9922584457795726 key: test_roc_auc value: [0.90625 0.96875 0.96875 1. 0.96875 0.96875 0.90208333 1. 0.96875 0.90416667] mean value: 0.955625 key: train_roc_auc value: [1. 0.99647887 0.99295775 0.99295775 1. 0.99295775 1. 0.99298237 0.98943662 0.99298237] mean value: 0.9950753471880233 key: test_jcc value: [0.83333333 0.94117647 0.94117647 1. 0.9375 0.9375 0.8125 1. 0.9375 0.82352941] mean value: 0.9164215686274509 key: train_jcc value: [1. 0.99295775 0.98591549 0.98591549 1. 0.98601399 1. 0.98611111 0.97887324 0.98601399] mean value: 0.990180105497007 MCC on Blind test: 0.06 Accuracy on Blind test: 0.21 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.06097364 0.08079934 0.08000755 0.07937479 0.07814407 0.07967234 0.09900784 0.10173464 0.09186673 0.07092118] mean value: 0.08225021362304688 key: score_time value: [0.01837087 0.02525473 0.02141833 0.02244473 0.02498221 0.02209568 0.02377081 0.02225614 0.02123618 0.0251863 ] mean value: 0.022701597213745116 key: test_mcc value: [0.68884672 0.38729833 0.77459667 0.37796447 0.38729833 0.62994079 0.61925228 0.48333333 0.35445878 0.61925228] mean value: 0.5322241996805657 key: train_mcc value: [0.97889751 0.97183099 0.97889751 0.9860133 0.99298237 0.98591549 0.98606255 0.9789707 0.98596474 0.98596474] mean value: 0.9831499896159636 key: test_accuracy value: [0.84375 0.6875 0.875 0.6875 0.6875 0.8125 0.80645161 0.74193548 0.67741935 0.80645161] mean value: 0.7626008064516129 key: train_accuracy value: [0.98943662 0.98591549 0.98943662 0.99295775 0.99647887 0.99295775 0.99298246 0.98947368 0.99298246 0.99298246] mean value: 0.9915604151223129 key: test_fscore value: [0.83870968 0.72222222 0.88888889 0.70588235 0.72222222 0.82352941 0.8125 0.73333333 0.70588235 0.8 ] mean value: 0.7753170461733081 key: train_fscore value: [0.98939929 0.98591549 0.98939929 0.9929078 0.99649123 0.99295775 0.99295775 0.98954704 0.99295775 0.99295775] mean value: 0.9915491133261819 key: test_precision value: [0.86666667 0.65 0.8 0.66666667 0.65 0.77777778 0.76470588 0.73333333 0.66666667 0.85714286] mean value: 0.743295985060691 key: train_precision value: [0.9929078 0.98591549 0.9929078 1. 0.99300699 0.99295775 1. 0.98611111 0.99295775 0.99295775] mean value: 0.992972243934935 key: test_recall value: [0.8125 0.8125 1. 0.75 0.8125 0.875 0.86666667 0.73333333 0.75 0.75 ] mean value: 0.81625 key: train_recall value: [0.98591549 0.98591549 0.98591549 0.98591549 1. 0.99295775 0.98601399 0.99300699 0.99295775 0.99295775] mean value: 0.9901556190288585 key: test_roc_auc value: [0.84375 0.6875 0.875 0.6875 0.6875 0.8125 0.80833333 0.74166667 0.675 0.80833333] mean value: 0.7627083333333333 key: train_roc_auc value: [0.98943662 0.98591549 0.98943662 0.99295775 0.99647887 0.99295775 0.99300699 0.98946124 0.99298237 0.99298237] mean value: 0.9915616074066779 key: test_jcc value: [0.72222222 0.56521739 0.8 0.54545455 0.56521739 0.7 0.68421053 0.57894737 0.54545455 0.66666667] mean value: 0.6373390657143517 key: train_jcc value: [0.97902098 0.97222222 0.97902098 0.98591549 0.99300699 0.98601399 0.98601399 0.97931034 0.98601399 0.98601399] mean value: 0.983255295511245 MCC on Blind test: 0.17 Accuracy on Blind test: 0.53 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.46722603 0.45420337 0.46067643 0.4648478 0.47499108 0.46509981 0.46426177 0.47671127 0.45206094 0.47163463] mean value: 0.4651713132858276 key: score_time value: [0.00948572 0.00930142 0.00971889 0.00974631 0.01031852 0.00943613 0.00939226 0.0101552 0.0094986 0.0096755 ] mean value: 0.009672856330871582 key: test_mcc value: [0.81409158 0.93933644 0.93933644 1. 1. 1. 0.87083333 1. 0.87770745 0.80833333] mean value: 0.9249638569805321 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.90625 0.96875 0.96875 1. 1. 1. 0.93548387 1. 0.93548387 0.90322581] mean value: 0.9617943548387097 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90909091 0.96969697 0.96969697 1. 1. 1. 0.93333333 1. 0.94117647 0.90322581] mean value: 0.962622045885803 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.88235294 0.94117647 0.94117647 1. 1. 1. 0.93333333 1. 0.88888889 0.93333333] mean value: 0.9520261437908497 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9375 1. 1. 1. 1. 1. 0.93333333 1. 1. 0.875 ] mean value: 0.9745833333333334 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90625 0.96875 0.96875 1. 1. 1. 0.93541667 1. 0.93333333 0.90416667] mean value: 0.9616666666666667 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.83333333 0.94117647 0.94117647 1. 1. 1. 0.875 1. 0.88888889 0.82352941] mean value: 0.9303104575163399 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.05 Accuracy on Blind test: 0.21 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02697706 0.02417254 0.03006434 0.02474928 0.02507758 0.02507782 0.02419448 0.02416945 0.02572727 0.03336573] mean value: 0.026357555389404298 key: score_time value: [0.01423049 0.01677632 0.01525044 0.01756954 0.01650715 0.01706696 0.01746702 0.01718903 0.017488 0.02302551] mean value: 0.017257046699523926 key: test_mcc value: [ 0.34752402 0.48038446 0.62554324 -0.13483997 0.07559289 0.32163376 -0.01581139 0.42352151 0.34258008 0.31407213] mean value: 0.27802007453897526 key: train_mcc value: [0.60447052 0.5990423 0.90582163 0.50659369 0.55022931 0.61533794 0.53458607 0.56718079 0.66601556 0.60091052] mean value: 0.6150188328105437 key: test_accuracy value: [0.65625 0.6875 0.78125 0.4375 0.53125 0.59375 0.48387097 0.67741935 0.64516129 0.64516129] mean value: 0.6139112903225806 key: train_accuracy value: [0.76760563 0.76408451 0.95070423 0.70422535 0.73239437 0.77464789 0.72280702 0.74385965 0.80701754 0.76491228] mean value: 0.7732258463059056 key: test_fscore value: [0.71794872 0.76190476 0.82051282 0.52631579 0.63414634 0.71111111 0.6 0.73684211 0.73170732 0.71794872] mean value: 0.6958437682699556 key: train_fscore value: [0.81142857 0.80911681 0.95302013 0.77173913 0.78888889 0.81609195 0.78356164 0.79665738 0.83775811 0.80911681] mean value: 0.8177379434782648 key: test_precision value: [0.60869565 0.61538462 0.69565217 0.45454545 0.52 0.55172414 0.48 0.60869565 0.6 0.60869565] mean value: 0.5743393338295887 key: train_precision value: [0.68269231 0.67942584 0.91025641 0.62831858 0.65137615 0.68932039 0.64414414 0.66203704 0.72081218 0.67942584] mean value: 0.6947808875721466 key: test_recall value: [0.875 1. 1. 0.625 0.8125 1. 0.8 0.93333333 0.9375 0.875 ] mean value: 0.8858333333333334 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.65625 0.6875 0.78125 0.4375 0.53125 0.59375 0.49375 0.68541667 0.63541667 0.6375 ] mean value: 0.6139583333333334 key: train_roc_auc value: [0.76760563 0.76408451 0.95070423 0.70422535 0.73239437 0.77464789 0.72183099 0.74295775 0.80769231 0.76573427] mean value: 0.7731877277651925 key: test_jcc value: [0.56 0.61538462 0.69565217 0.35714286 0.46428571 0.55172414 0.42857143 0.58333333 0.57692308 0.56 ] mean value: 0.5393017337485104 key: train_jcc value: [0.68269231 0.67942584 0.91025641 0.62831858 0.65137615 0.68932039 0.64414414 0.66203704 0.72081218 0.67942584] mean value: 0.6947808875721466 MCC on Blind test: 0.1 Accuracy on Blind test: 0.35 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02332354 0.04450059 0.02891874 0.03595114 0.03665876 0.03526258 0.03499365 0.03509426 0.03193188 0.03515124] mean value: 0.034178638458251955 key: score_time value: [0.02395391 0.03560877 0.0217123 0.0207777 0.02317023 0.02268767 0.02342653 0.02022815 0.02111912 0.02195239] mean value: 0.023463678359985352 key: test_mcc value: [0.81409158 0.75592895 0.93933644 0.82717019 0.875 0.81409158 0.9375 1. 0.80753845 0.80833333] mean value: 0.8578990514118684 key: train_mcc value: [0.95812669 0.94450549 0.94450549 0.93040839 0.95129413 0.93775982 0.95145657 0.92390856 0.93798423 0.93065917] mean value: 0.9410608551595332 key: test_accuracy value: [0.90625 0.875 0.96875 0.90625 0.9375 0.90625 0.96774194 1. 0.90322581 0.90322581] mean value: 0.9274193548387096 key: train_accuracy value: [0.97887324 0.97183099 0.97183099 0.96478873 0.97535211 0.96830986 0.9754386 0.96140351 0.96842105 0.96491228] mean value: 0.9701161354089449 key: test_fscore value: [0.90909091 0.88235294 0.96969697 0.91428571 0.9375 0.90322581 0.96774194 1. 0.90909091 0.90322581] mean value: 0.9296210991728069 key: train_fscore value: /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:175: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:178: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) [0.97916667 0.97241379 0.97241379 0.96551724 0.97577855 0.96907216 0.97594502 0.96245734 0.96907216 0.96551724] mean value: 0.9707353967307983 key: test_precision value: [0.88235294 0.83333333 0.94117647 0.84210526 0.9375 0.93333333 0.9375 1. 0.88235294 0.93333333] mean value: 0.9122987616099071 key: train_precision value: [0.96575342 0.9527027 0.9527027 0.94594595 0.95918367 0.94630872 0.95945946 0.94 0.94630872 0.94594595] mean value: 0.9514311304548109 key: test_recall value: [0.9375 0.9375 1. 1. 0.9375 0.875 1. 1. 0.9375 0.875 ] mean value: 0.95 key: train_recall value: [0.99295775 0.99295775 0.99295775 0.98591549 0.99295775 0.99295775 0.99300699 0.98601399 0.99295775 0.98591549] mean value: 0.9908598443809712 key: test_roc_auc value: [0.90625 0.875 0.96875 0.90625 0.9375 0.90625 0.96875 1. 0.90208333 0.90416667] mean value: 0.9275 key: train_roc_auc value: [0.97887324 0.97183099 0.97183099 0.96478873 0.97535211 0.96830986 0.97537674 0.96131685 0.96850685 0.96498572] mean value: 0.970117206736925 key: test_jcc value: [0.83333333 0.78947368 0.94117647 0.84210526 0.88235294 0.82352941 0.9375 1. 0.83333333 0.82352941] mean value: 0.8706333849329205 key: train_jcc value: [0.95918367 0.94630872 0.94630872 0.93333333 0.9527027 0.94 0.95302013 0.92763158 0.94 0.93333333] mean value: 0.9431822205678743 MCC on Blind test: 0.21 Accuracy on Blind test: 0.51 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.26468396 0.23707485 0.23484063 0.23813033 0.29126167 0.26690769 0.24119735 0.23906517 0.23730087 0.24584031] mean value: 0.24963028430938722 key: score_time value: [0.02218151 0.0205338 0.02146173 0.02315283 0.02370119 0.01999259 0.0225513 0.02335763 0.02196383 0.02384853] mean value: 0.022274494171142578 key: test_mcc value: [0.81409158 0.75592895 0.93933644 0.82717019 0.875 0.81409158 0.9375 1. 0.80753845 0.80833333] mean value: 0.8578990514118684 key: train_mcc value: [0.95812669 0.94450549 0.94450549 0.93040839 0.95129413 0.93775982 0.95145657 0.92390856 0.9582759 0.93065917] mean value: 0.9430900221063301 key: test_accuracy value: [0.90625 0.875 0.96875 0.90625 0.9375 0.90625 0.96774194 1. 0.90322581 0.90322581] mean value: 0.9274193548387096 key: train_accuracy value: [0.97887324 0.97183099 0.97183099 0.96478873 0.97535211 0.96830986 0.9754386 0.96140351 0.97894737 0.96491228] mean value: 0.9711687669878923 key: test_fscore value: [0.90909091 0.88235294 0.96969697 0.91428571 0.9375 0.90322581 0.96774194 1. 0.90909091 0.90322581] mean value: 0.9296210991728069 key: train_fscore value: [0.97916667 0.97241379 0.97241379 0.96551724 0.97577855 0.96907216 0.97594502 0.96245734 0.97916667 0.96551724] mean value: 0.9717448469026196 key: test_precision value: [0.88235294 0.83333333 0.94117647 0.84210526 0.9375 0.93333333 0.9375 1. 0.88235294 0.93333333] mean value: 0.9122987616099071 key: train_precision value: [0.96575342 0.9527027 0.9527027 0.94594595 0.95918367 0.94630872 0.95945946 0.94 0.96575342 0.94594595] mean value: 0.9533756004373428 key: test_recall value: [0.9375 0.9375 1. 1. 0.9375 0.875 1. 1. 0.9375 0.875 ] mean value: 0.95 key: train_recall value: [0.99295775 0.99295775 0.99295775 0.98591549 0.99295775 0.99295775 0.99300699 0.98601399 0.99295775 0.98591549] mean value: 0.9908598443809712 key: test_roc_auc value: [0.90625 0.875 0.96875 0.90625 0.9375 0.90625 0.96875 1. 0.90208333 0.90416667] mean value: 0.9275 key: train_roc_auc value: [0.97887324 0.97183099 0.97183099 0.96478873 0.97535211 0.96830986 0.97537674 0.96131685 0.97899636 0.96498572] mean value: 0.9711661577858761 key: test_jcc value: [0.83333333 0.78947368 0.94117647 0.84210526 0.88235294 0.82352941 0.9375 1. 0.83333333 0.82352941] mean value: 0.8706333849329205 key: train_jcc value: [0.95918367 0.94630872 0.94630872 0.93333333 0.9527027 0.94 0.95302013 0.92763158 0.95918367 0.93333333] mean value: 0.9451005879148131 MCC on Blind test: 0.21 Accuracy on Blind test: 0.51 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03810716 0.03709626 0.0367434 0.04261827 0.03460145 0.05663276 0.05360341 0.04755831 0.0376575 0.03705287] mean value: 0.042167139053344724 key: score_time value: [0.01188159 0.01441073 0.01452422 0.01903272 0.01184082 0.01493359 0.01481843 0.01497293 0.01458144 0.01462245] mean value: 0.014561891555786133 key: test_mcc value: [0.81325006 0.83914639 0.83914639 0.74348441 0.77459667 0.87096774 0.69047575 0.84983659 0.73763441 0.6844511 ] mean value: 0.7842989510343097 key: train_mcc value: [0.8599849 0.85666952 0.842796 0.89965316 0.84251189 0.84624951 0.83935221 0.86718143 0.84417004 0.8393479 ] mean value: 0.8537916563444095 key: test_accuracy value: [0.90322581 0.91935484 0.91935484 0.87096774 0.88709677 0.93548387 0.83870968 0.91935484 0.86885246 0.83606557] mean value: 0.8898466419883659 key: train_accuracy value: [0.92985612 0.92805755 0.92086331 0.94964029 0.92086331 0.92266187 0.91906475 0.93345324 0.92100539 0.91921005] mean value: 0.926467587151105 key: test_fscore value: [0.90909091 0.92063492 0.92063492 0.86666667 0.88888889 0.93548387 0.85294118 0.92537313 0.86666667 0.85294118] mean value: 0.8939322330820249 key: train_fscore value: [0.93072824 0.92932862 0.92280702 0.95035461 0.92253521 0.92442882 0.92119089 0.93428064 0.92387543 0.92091388] mean value: 0.9280443373841807 key: test_precision value: [0.85714286 0.90625 0.90625 0.89655172 0.875 0.93548387 0.78378378 0.86111111 0.86666667 0.78378378] mean value: 0.8672023797593875 key: train_precision value: [0.91929825 0.91319444 0.90068493 0.93706294 0.90344828 0.90378007 0.89761092 0.92280702 0.89297659 0.90034364] mean value: 0.909120707350487 key: test_recall value: [0.96774194 0.93548387 0.93548387 0.83870968 0.90322581 0.93548387 0.93548387 1. 0.86666667 0.93548387] mean value: 0.9253763440860214 key: train_recall value: [0.94244604 0.94604317 0.94604317 0.96402878 0.94244604 0.94604317 0.94604317 0.94604317 0.95698925 0.94244604] mean value: 0.9478571981124778 key: test_roc_auc value: [0.90322581 0.91935484 0.91935484 0.87096774 0.88709677 0.93548387 0.83870968 0.91935484 0.8688172 0.8344086 ] mean value: 0.8896774193548387 key: train_roc_auc value: [0.92985612 0.92805755 0.92086331 0.94964029 0.92086331 0.92266187 0.91906475 0.93345324 0.92094067 0.9192517 ] mean value: 0.9264652793893916 key: test_jcc value: [0.83333333 0.85294118 0.85294118 0.76470588 0.8 0.87878788 0.74358974 0.86111111 0.76470588 0.74358974] mean value: 0.809570592805887 key: train_jcc value: [0.87043189 0.8679868 0.85667752 0.90540541 0.85620915 0.85947712 0.8538961 0.87666667 0.8585209 0.8534202 ] mean value: 0.8658691763036805 MCC on Blind test: 0.23 Accuracy on Blind test: 0.53 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.89820576 0.95263267 0.95077991 0.91781855 1.08768177 0.94778204 1.01672673 0.92168379 1.05719733 0.91310978] mean value: 0.9663618326187133 key: score_time value: [0.01472855 0.02327847 0.01604557 0.01537609 0.01533103 0.01232958 0.01567435 0.01526046 0.01890039 0.01533055] mean value: 0.016225504875183105 key: test_mcc value: [0.96824584 0.87096774 0.84983659 0.90748521 0.90369611 0.96824584 0.81325006 0.84266484 0.67314268 0.96770777] mean value: 0.876524268542267 key: train_mcc value: [1. 1. 0.99640932 0.97482645 0.97482645 0.97487691 1. 1. 1. 0.99641572] mean value: 0.9917354859485006 key: test_accuracy value: [0.98387097 0.93548387 0.91935484 0.9516129 0.9516129 0.98387097 0.90322581 0.91935484 0.83606557 0.98360656] mean value: 0.9368059227921735 key: train_accuracy value: [1. 1. 0.99820144 0.98741007 0.98741007 0.98741007 1. 1. 1. 0.99820467] mean value: 0.9958636322539813 key: test_fscore value: [0.98360656 0.93548387 0.9122807 0.94915254 0.95238095 0.98360656 0.90909091 0.91525424 0.82758621 0.98412698] mean value: 0.935256951963264 key: train_fscore value: [1. 1. 0.99820467 0.98743268 0.98743268 0.98747764 1. 1. 1. 0.9981982 ] mean value: 0.9958745854791948 key: test_precision value: [1. 0.93548387 1. 1. 0.9375 1. 0.85714286 0.96428571 0.85714286 0.96875 ] mean value: 0.952030529953917 key: train_precision value: [1. 1. 0.99641577 0.98566308 0.98566308 0.98220641 1. 1. 1. 1. ] mean value: 0.9949948341177821 key: test_recall value: [0.96774194 0.93548387 0.83870968 0.90322581 0.96774194 0.96774194 0.96774194 0.87096774 0.8 1. ] mean value: 0.9219354838709678 key: train_recall value: [1. 1. 1. 0.98920863 0.98920863 0.99280576 1. 1. 1. 0.99640288] mean value: 0.9967625899280576 key: test_roc_auc value: [0.98387097 0.93548387 0.91935484 0.9516129 0.9516129 0.98387097 0.90322581 0.91935484 0.83548387 0.98333333] mean value: 0.9367204301075269 key: train_roc_auc value: [1. 1. 0.99820144 0.98741007 0.98741007 0.98741007 1. 1. 1. 0.99820144] mean value: 0.995863309352518 key: test_jcc value: [0.96774194 0.87878788 0.83870968 0.90322581 0.90909091 0.96774194 0.83333333 0.84375 0.70588235 0.96875 ] mean value: 0.8817013828992007 key: train_jcc value: [1. 1. 0.99641577 0.9751773 0.9751773 0.97526502 1. 1. 1. 0.99640288] mean value: 0.9918438275904083 MCC on Blind test: 0.14 Accuracy on Blind test: 0.41 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.02257085 0.0112493 0.01054573 0.01036143 0.01050878 0.01036024 0.01019192 0.01051331 0.01032925 0.01022053] mean value: 0.01168513298034668 key: score_time value: [0.01106262 0.00959563 0.00914097 0.00893879 0.00901771 0.00893164 0.00887775 0.00896096 0.00892639 0.00891471] mean value: 0.009236717224121093 key: test_mcc value: [0.58338335 0.5483871 0.74193548 0.51639778 0.54953196 0.54953196 0.51119863 0.64549722 0.54459739 0.60733867] mean value: 0.5797799541660318 key: train_mcc value: [0.58633473 0.58998913 0.5865169 0.57494473 0.62591548 0.62262853 0.57914044 0.60433219 0.60507789 0.59092789] mean value: 0.5965807894919312 key: test_accuracy value: [0.79032258 0.77419355 0.87096774 0.75806452 0.77419355 0.77419355 0.74193548 0.82258065 0.7704918 0.80327869] mean value: 0.7880222104706505 key: train_accuracy value: [0.79316547 0.79496403 0.79316547 0.78417266 0.81294964 0.81115108 0.78956835 0.80215827 0.80251346 0.79533214] mean value: 0.7979140565465043 key: test_fscore value: [0.8 0.77419355 0.87096774 0.75409836 0.76666667 0.76666667 0.77777778 0.82539683 0.75 0.8125 ] mean value: 0.7898267587486255 key: train_fscore value: [0.7935368 0.79642857 0.79573712 0.7993311 0.81227437 0.81415929 0.78918919 0.80286738 0.80427046 0.79787234] mean value: 0.8005666638001188 key: test_precision value: [0.76470588 0.77419355 0.87096774 0.76666667 0.79310345 0.79310345 0.68292683 0.8125 0.80769231 0.78787879] mean value: 0.78537386607333 key: train_precision value: [0.7921147 0.79078014 0.78596491 0.746875 0.81521739 0.80139373 0.79061372 0.8 0.79858657 0.78671329] mean value: 0.7908259446555521 key: test_recall value: [0.83870968 0.77419355 0.87096774 0.74193548 0.74193548 0.74193548 0.90322581 0.83870968 0.7 0.83870968] mean value: 0.7990322580645162 key: train_recall value: [0.79496403 0.80215827 0.8057554 0.85971223 0.80935252 0.82733813 0.78776978 0.8057554 0.81003584 0.80935252] mean value: 0.8112194115675202 key: test_roc_auc value: [0.79032258 0.77419355 0.87096774 0.75806452 0.77419355 0.77419355 0.74193548 0.82258065 0.76935484 0.80268817] mean value: 0.7878494623655914 key: train_roc_auc value: [0.79316547 0.79496403 0.79316547 0.78417266 0.81294964 0.81115108 0.78956835 0.80215827 0.80249994 0.79535726] mean value: 0.7979152162141254 key: test_jcc value: [0.66666667 0.63157895 0.77142857 0.60526316 0.62162162 0.62162162 0.63636364 0.7027027 0.6 0.68421053] mean value: 0.6541457451983768 key: train_jcc value: [0.6577381 0.66172107 0.66076696 0.66573816 0.68389058 0.68656716 0.65178571 0.67065868 0.67261905 0.66371681] mean value: 0.6675202287084647 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01051378 0.01047087 0.01060367 0.01055789 0.01053548 0.01041293 0.01049829 0.01054573 0.01057553 0.01064825] mean value: 0.01053624153137207 key: score_time value: [0.00896335 0.00887966 0.00899315 0.00901151 0.00903106 0.00891781 0.00896287 0.00900006 0.00901461 0.00900006] mean value: 0.008977413177490234 key: test_mcc value: [0.61418277 0.64549722 0.7190925 0.61418277 0.67883359 0.64549722 0.56761348 0.64820372 0.61256703 0.60733867] mean value: 0.6353008987490497 key: train_mcc value: [0.67625899 0.66202471 0.66557529 0.69446479 0.683524 0.65510022 0.68741069 0.6870548 0.67744343 0.68448223] mean value: 0.6773339149399755 key: test_accuracy value: [0.80645161 0.82258065 0.85483871 0.80645161 0.83870968 0.82258065 0.77419355 0.82258065 0.80327869 0.80327869] mean value: 0.8154944473823373 key: train_accuracy value: [0.8381295 0.83093525 0.83273381 0.8471223 0.84172662 0.82733813 0.84352518 0.84352518 0.83842011 0.84201077] mean value: 0.8385466850935769 key: test_fscore value: [0.8125 0.82539683 0.84210526 0.8 0.83333333 0.81967213 0.8 0.83076923 0.8125 0.8125 ] mean value: 0.8188776783804825 key: train_fscore value: [0.8381295 0.83274021 0.8342246 0.84902309 0.84285714 0.83038869 0.8460177 0.8438061 0.84210526 0.84452297] mean value: 0.8403815269479368 key: test_precision value: [0.78787879 0.8125 0.92307692 0.82758621 0.86206897 0.83333333 0.71794872 0.79411765 0.76470588 0.78787879] mean value: 0.8111095251942108 key: train_precision value: [0.8381295 0.82394366 0.82685512 0.83859649 0.83687943 0.81597222 0.83275261 0.84229391 0.82474227 0.82986111] mean value: 0.8310026327326828 key: test_recall value: [0.83870968 0.83870968 0.77419355 0.77419355 0.80645161 0.80645161 0.90322581 0.87096774 0.86666667 0.83870968] mean value: 0.8318279569892473 key: train_recall value: [0.8381295 0.84172662 0.84172662 0.85971223 0.84892086 0.84532374 0.85971223 0.84532374 0.86021505 0.85971223] mean value: 0.8500502823547613 key: test_roc_auc value: [0.80645161 0.82258065 0.85483871 0.80645161 0.83870968 0.82258065 0.77419355 0.82258065 0.80430108 0.80268817] mean value: 0.8155376344086022 key: train_roc_auc value: [0.8381295 0.83093525 0.83273381 0.8471223 0.84172662 0.82733813 0.84352518 0.84352518 0.83838091 0.8420425 ] mean value: 0.8385459374435935 key: test_jcc value: [0.68421053 0.7027027 0.72727273 0.66666667 0.71428571 0.69444444 0.66666667 0.71052632 0.68421053 0.68421053] mean value: 0.6935196816775764 key: train_jcc value: [0.72136223 0.71341463 0.71559633 0.73765432 0.72839506 0.70996979 0.73312883 0.72981366 0.72727273 0.73088685] mean value: 0.7247494441137159 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00978923 0.00980663 0.01080918 0.01040244 0.01147366 0.01142359 0.01133847 0.01122069 0.01150775 0.01156354] mean value: 0.010933518409729004 key: score_time value: [0.01632094 0.01658916 0.01412439 0.01506782 0.01708412 0.01405382 0.01692677 0.01375461 0.01690364 0.01371646] mean value: 0.01545417308807373 key: test_mcc value: [0.48488114 0.42289003 0.55301004 0.52297636 0.5809475 0.52297636 0.51639778 0.42023032 0.63939757 0.27849462] mean value: 0.49422017278609187 key: train_mcc value: [0.68522881 0.73158497 0.71447096 0.73033396 0.68805267 0.69663288 0.72550886 0.71313508 0.69535672 0.67478102] mean value: 0.705508594382046 key: test_accuracy value: [0.74193548 0.70967742 0.77419355 0.75806452 0.79032258 0.75806452 0.75806452 0.70967742 0.81967213 0.63934426] mean value: 0.7459016393442623 key: train_accuracy value: [0.84172662 0.86510791 0.85611511 0.86510791 0.84352518 0.8471223 0.86151079 0.85611511 0.84739677 0.83662478] mean value: 0.8520352479237435 key: test_fscore value: [0.75 0.72727273 0.75862069 0.73684211 0.78688525 0.73684211 0.75409836 0.7 0.81355932 0.64516129] mean value: 0.7409281846368071 key: train_fscore value: [0.8358209 0.86085343 0.85018727 0.86388385 0.83918669 0.84052533 0.85553471 0.85239852 0.84460695 0.83054004] mean value: 0.8473537678320475 key: test_precision value: [0.72727273 0.68571429 0.81481481 0.80769231 0.8 0.80769231 0.76666667 0.72413793 0.82758621 0.64516129] mean value: 0.7606738538106725 key: train_precision value: [0.86821705 0.88888889 0.88671875 0.87179487 0.86311787 0.87843137 0.89411765 0.875 0.8619403 0.86100386] mean value: 0.8749230614788926 key: test_recall value: [0.77419355 0.77419355 0.70967742 0.67741935 0.77419355 0.67741935 0.74193548 0.67741935 0.8 0.64516129] mean value: 0.7251612903225806 key: train_recall value: [0.8057554 0.83453237 0.81654676 0.85611511 0.81654676 0.8057554 0.82014388 0.83093525 0.82795699 0.80215827] mean value: 0.8216446197880406 key: test_roc_auc value: [0.74193548 0.70967742 0.77419355 0.75806452 0.79032258 0.75806452 0.75806452 0.70967742 0.81935484 0.63924731] mean value: 0.7458602150537634 key: train_roc_auc value: [0.84172662 0.86510791 0.85611511 0.86510791 0.84352518 0.8471223 0.86151079 0.85611511 0.84743173 0.83656301] mean value: 0.8520325674943916 key: test_jcc value: [0.6 0.57142857 0.61111111 0.58333333 0.64864865 0.58333333 0.60526316 0.53846154 0.68571429 0.47619048] mean value: 0.5903484456116035 key: train_jcc value: [0.71794872 0.75570033 0.73941368 0.76038339 0.72292994 0.72491909 0.74754098 0.74276527 0.73101266 0.71019108] mean value: 0.7352805139150561 MCC on Blind test: 0.17 Accuracy on Blind test: 0.55 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.02437425 0.02450132 0.02496505 0.02432227 0.02489543 0.02445531 0.0243113 0.02456117 0.02471304 0.02392483] mean value: 0.02450239658355713 key: score_time value: [0.01235843 0.01249051 0.01268172 0.01241851 0.01266956 0.0124321 0.01242375 0.01243353 0.01217508 0.01202798] mean value: 0.012411117553710938 key: test_mcc value: [0.68313005 0.77784447 0.83914639 0.71004695 0.77459667 0.77459667 0.64751827 0.78446454 0.67384323 0.65552656] mean value: 0.73207137931005 key: train_mcc value: [0.79209132 0.78094965 0.78547437 0.84774592 0.77493517 0.79541168 0.80491779 0.77493517 0.7996419 0.80193561] mean value: 0.7958038569876071 key: test_accuracy value: [0.83870968 0.88709677 0.91935484 0.85483871 0.88709677 0.88709677 0.80645161 0.88709677 0.83606557 0.81967213] mean value: 0.8623479640401903 key: train_accuracy value: [0.89388489 0.88848921 0.89028777 0.92266187 0.88489209 0.89568345 0.90107914 0.88489209 0.89766607 0.8994614 ] mean value: 0.895899797217881 key: test_fscore value: [0.84848485 0.89230769 0.92063492 0.85245902 0.88888889 0.88888889 0.83333333 0.89552239 0.83870968 0.84057971] mean value: 0.8699809364555999 key: train_fscore value: [0.8991453 0.89383562 0.89608177 0.9254766 0.89115646 0.90068493 0.90500864 0.89115646 0.90289608 0.90344828] mean value: 0.9008890140313144 key: test_precision value: [0.8 0.85294118 0.90625 0.86666667 0.875 0.875 0.73170732 0.83333333 0.8125 0.76315789] mean value: 0.8316556388280602 key: train_precision value: [0.85667752 0.85294118 0.85113269 0.89297659 0.84516129 0.85947712 0.87043189 0.84516129 0.86038961 0.86754967] mean value: 0.8601898853393118 key: test_recall value: [0.90322581 0.93548387 0.93548387 0.83870968 0.90322581 0.90322581 0.96774194 0.96774194 0.86666667 0.93548387] mean value: 0.9156989247311828 key: train_recall value: [0.94604317 0.93884892 0.94604317 0.96043165 0.94244604 0.94604317 0.94244604 0.94244604 0.94982079 0.94244604] mean value: 0.9457015033134782 key: test_roc_auc value: [0.83870968 0.88709677 0.91935484 0.85483871 0.88709677 0.88709677 0.80645161 0.88709677 0.83655914 0.81774194] mean value: 0.8622043010752688 key: train_roc_auc value: [0.89388489 0.88848921 0.89028777 0.92266187 0.88489209 0.89568345 0.90107914 0.88489209 0.89757226 0.89953843] mean value: 0.8958981202135066 key: test_jcc value: [0.73684211 0.80555556 0.85294118 0.74285714 0.8 0.8 0.71428571 0.81081081 0.72222222 0.725 ] mean value: 0.7710514727465192 key: train_jcc value: [0.81677019 0.80804954 0.8117284 0.86129032 0.80368098 0.81931464 0.82649842 0.80368098 0.82298137 0.82389937] mean value: 0.8197894204757968 MCC on Blind test: 0.22 Accuracy on Blind test: 0.49 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.93708968 2.20079994 2.05276179 2.0693748 2.06285834 2.05502129 2.20690989 2.06774426 2.16239572 2.21165562] mean value: 2.1026611328125 key: score_time value: [0.0157094 0.02155018 0.02022552 0.01245332 0.01499414 0.02136111 0.01244831 0.01244354 0.02483344 0.02263308] mean value: 0.017865204811096193 key: test_mcc value: [0.87278605 0.87278605 0.84266484 0.82199494 0.87096774 0.87278605 0.74348441 0.90369611 0.71525965 0.96770777] mean value: 0.8484133607436208 key: train_mcc value: [1. 1. 1. 0.99280576 1. 1. 0.99640932 0.99640932 0.99641572 1. ] mean value: 0.9982040128141526 key: test_accuracy value: [0.93548387 0.93548387 0.91935484 0.90322581 0.93548387 0.93548387 0.87096774 0.9516129 0.85245902 0.98360656] mean value: 0.922316234796404 key: train_accuracy value: [1. 1. 1. 0.99640288 1. 1. 0.99820144 0.99820144 0.99820467 1. ] mean value: 0.9991010423259238 key: test_fscore value: [0.9375 0.9375 0.91525424 0.89285714 0.93548387 0.93333333 0.875 0.95081967 0.86153846 0.98412698] mean value: 0.9223413702242946 key: train_fscore value: [1. 1. 1. 0.99640288 1. 1. 0.99820467 0.9981982 0.99821109 1. ] mean value: 0.9991016834993942 key: test_precision value: [0.90909091 0.90909091 0.96428571 1. 0.93548387 0.96551724 0.84848485 0.96666667 0.8 0.96875 ] mean value: 0.92673701599661 key: train_precision value: [1. 1. 1. 0.99640288 1. 1. 0.99641577 1. 0.99642857 1. ] mean value: 0.9989247219735732 key: test_recall value: [0.96774194 0.96774194 0.87096774 0.80645161 0.93548387 0.90322581 0.90322581 0.93548387 0.93333333 1. ] mean value: 0.9223655913978495 key: train_recall value: [1. 1. 1. 0.99640288 1. 1. 1. 0.99640288 1. 1. ] mean value: 0.9992805755395684 key: test_roc_auc value: [0.93548387 0.93548387 0.91935484 0.90322581 0.93548387 0.93548387 0.87096774 0.9516129 0.85376344 0.98333333] mean value: 0.9224193548387097 key: train_roc_auc value: [1. 1. 1. 0.99640288 1. 1. 0.99820144 0.99820144 0.99820144 1. ] mean value: 0.9991007194244604 key: test_jcc value: [0.88235294 0.88235294 0.84375 0.80645161 0.87878788 0.875 0.77777778 0.90625 0.75675676 0.96875 ] mean value: 0.8578229908578581 key: train_jcc value: [1. 1. 1. 0.99283154 1. 1. 0.99641577 0.99640288 0.99642857 1. ] mean value: 0.998207876095437 MCC on Blind test: 0.17 Accuracy on Blind test: 0.51 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.03720474 0.02276659 0.02222443 0.02198648 0.02197099 0.02208471 0.02081251 0.02087331 0.02242088 0.01969099] mean value: 0.023203563690185548 key: score_time value: [0.00942612 0.0091238 0.00888109 0.00883532 0.00888824 0.00891638 0.0089848 0.00916314 0.00912094 0.00890946] mean value: 0.009024930000305176 key: test_mcc value: [0.96824584 0.90369611 0.96824584 0.93743687 0.90748521 1. 0.87278605 0.84266484 0.87082935 0.9344086 ] mean value: 0.9205798710435014 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.9516129 0.98387097 0.96774194 0.9516129 1. 0.93548387 0.91935484 0.93442623 0.96721311] mean value: 0.9595187731359069 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98360656 0.95081967 0.98412698 0.96666667 0.94915254 1. 0.9375 0.91525424 0.93548387 0.96774194] mean value: 0.9590352466414477 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.96666667 0.96875 1. 1. 1. 0.90909091 0.96428571 0.90625 0.96774194] mean value: 0.9682785225527161 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 1. 0.93548387 0.90322581 1. 0.96774194 0.87096774 0.96666667 0.96774194] mean value: 0.951505376344086 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.9516129 0.98387097 0.96774194 0.9516129 1. 0.93548387 0.91935484 0.93494624 0.9672043 ] mean value: 0.9595698924731183 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96774194 0.90625 0.96875 0.93548387 0.90322581 1. 0.88235294 0.84375 0.87878788 0.9375 ] mean value: 0.9223842432867575 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.01 Accuracy on Blind test: 0.2 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.1242733 0.12514329 0.12573195 0.12572408 0.12674093 0.12578678 0.12831879 0.12836862 0.12577057 0.12617683] mean value: 0.12620351314544678 key: score_time value: [0.01789355 0.01834917 0.01798558 0.01879549 0.0181818 0.01906204 0.01819825 0.01794028 0.01799631 0.01796556] mean value: 0.018236804008483886 key: test_mcc value: [0.87278605 0.87278605 0.90369611 0.93743687 0.93743687 0.83914639 0.87831007 0.90748521 0.8403496 0.8688172 ] mean value: 0.8858250418390108 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.93548387 0.93548387 0.9516129 0.96774194 0.96774194 0.91935484 0.93548387 0.9516129 0.91803279 0.93442623] mean value: 0.9416975145425701 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.9375 0.9375 0.95081967 0.96666667 0.96666667 0.92063492 0.93939394 0.95384615 0.92063492 0.93548387] mean value: 0.9429146810942157 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.90909091 0.90909091 0.96666667 1. 1. 0.90625 0.88571429 0.91176471 0.87878788 0.93548387] mean value: 0.9302849226200745 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.96774194 0.93548387 0.93548387 0.93548387 0.93548387 1. 1. 0.96666667 0.93548387] mean value: 0.9579569892473118 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.93548387 0.93548387 0.9516129 0.96774194 0.96774194 0.91935484 0.93548387 0.9516129 0.9188172 0.9344086 ] mean value: 0.9417741935483872 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.88235294 0.88235294 0.90625 0.93548387 0.93548387 0.85294118 0.88571429 0.91176471 0.85294118 0.87878788] mean value: 0.8924072847614118 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.16 Accuracy on Blind test: 0.36 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01059222 0.01075006 0.01117396 0.01065135 0.01077533 0.01067138 0.01046896 0.01045394 0.01072764 0.01058769] mean value: 0.010685253143310546 key: score_time value: [0.0090189 0.00899482 0.00891805 0.00895977 0.00898051 0.00880671 0.00890422 0.00889993 0.00933266 0.00889468] mean value: 0.008971023559570312 key: test_mcc value: [0.61807005 0.68313005 0.71004695 0.61807005 0.84266484 0.45760432 0.54953196 0.64549722 0.60645161 0.80322581] mean value: 0.6534292846193077 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.80645161 0.83870968 0.85483871 0.80645161 0.91935484 0.72580645 0.77419355 0.82258065 0.80327869 0.90163934] mean value: 0.8253305129561079 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.79310345 0.82758621 0.85714286 0.79310345 0.91525424 0.70175439 0.76666667 0.81967213 0.8 0.90322581] mean value: 0.8177509188110001 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.85185185 0.88888889 0.84375 0.85185185 0.96428571 0.76923077 0.79310345 0.83333333 0.8 0.90322581] mean value: 0.8499521664169885 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.74193548 0.77419355 0.87096774 0.74193548 0.87096774 0.64516129 0.74193548 0.80645161 0.8 0.90322581] mean value: 0.7896774193548387 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.80645161 0.83870968 0.85483871 0.80645161 0.91935484 0.72580645 0.77419355 0.82258065 0.80322581 0.9016129 ] mean value: 0.8253225806451613 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.65714286 0.70588235 0.75 0.65714286 0.84375 0.54054054 0.62162162 0.69444444 0.66666667 0.82352941] mean value: 0.696072075226487 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.1 Accuracy on Blind test: 0.43 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.83548903 1.8379004 1.86891341 1.86416602 1.86113143 1.84892035 1.83393502 1.84949994 1.83295679 1.83171296] mean value: 1.8464625358581543 key: score_time value: [0.0921917 0.09853959 0.09397936 0.09889221 0.09865165 0.09259009 0.09273338 0.09274316 0.09208941 0.09240556] mean value: 0.09448161125183105 key: test_mcc value: [0.90369611 0.96824584 0.96824584 0.96824584 0.96824584 1. 0.87831007 0.96824584 0.87082935 0.93635873] mean value: 0.9430423448212051 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9516129 0.98387097 0.98387097 0.98387097 0.98387097 1. 0.93548387 0.98387097 0.93442623 0.96721311] mean value: 0.970809095716552 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.95238095 0.98412698 0.98360656 0.98360656 0.98360656 1. 0.93939394 0.98412698 0.93548387 0.96875 ] mean value: 0.9715082403127749 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9375 0.96875 1. 1. 1. 1. 0.88571429 0.96875 0.90625 0.93939394] mean value: 0.9606358225108225 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 1. 0.96774194 0.96774194 0.96774194 1. 1. 1. 0.96666667 1. ] mean value: 0.983763440860215 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9516129 0.98387097 0.98387097 0.98387097 0.98387097 1. 0.93548387 0.98387097 0.93494624 0.96666667] mean value: 0.9708064516129032 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.90909091 0.96875 0.96774194 0.96774194 0.96774194 1. 0.88571429 0.96875 0.87878788 0.93939394] mean value: 0.9453712819438626 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.07 Accuracy on Blind test: 0.21 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000...05', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.00086403 1.01194215 1.00798345 1.02354908 0.99338937 1.05827212 0.98856235 0.96865797 1.00004292 1.03345537] mean value: 1.0086718797683716 key: score_time value: [0.2509551 0.21219444 0.20999908 0.22287941 0.26200318 0.24685287 0.26816392 0.22774673 0.2380302 0.17141843] mean value: 0.23102433681488038 key: test_mcc value: [0.84266484 0.90369611 0.96824584 0.96824584 0.96824584 0.96824584 0.82199494 0.96824584 0.90215054 0.87055472] mean value: 0.9182290332262679 key: train_mcc value: [0.98563702 0.97844259 0.98207157 0.98207157 0.97844259 0.98207157 0.97487691 0.98563702 0.98566253 0.97848145] mean value: 0.9813394821071346 key: test_accuracy value: [0.91935484 0.9516129 0.98387097 0.98387097 0.98387097 0.98387097 0.90322581 0.98387097 0.95081967 0.93442623] mean value: 0.9578794288736119 key: train_accuracy value: [0.99280576 0.98920863 0.99100719 0.99100719 0.98920863 0.99100719 0.98741007 0.99280576 0.99281867 0.98922801] mean value: 0.9906507110290224 key: test_fscore value: [0.92307692 0.95238095 0.98360656 0.98360656 0.98360656 0.98360656 0.91176471 0.98412698 0.95081967 0.9375 ] mean value: 0.9594095467106557 key: train_fscore value: [0.99283154 0.98924731 0.99105546 0.99105546 0.98924731 0.99105546 0.98747764 0.99283154 0.99285714 0.98924731] mean value: 0.9906906167933925 key: test_precision value: [0.88235294 0.9375 1. 1. 1. 1. 0.83783784 0.96875 0.93548387 0.90909091] mean value: 0.9471015559072959 key: train_precision value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( [0.98928571 0.98571429 0.98576512 0.98576512 0.98571429 0.98576512 0.98220641 0.98928571 0.98932384 0.98571429] mean value: 0.9864539908490086 key: test_recall value: [0.96774194 0.96774194 0.96774194 0.96774194 0.96774194 0.96774194 1. 1. 0.96666667 0.96774194] mean value: 0.9740860215053764 key: train_recall value: [0.99640288 0.99280576 0.99640288 0.99640288 0.99280576 0.99640288 0.99280576 0.99640288 0.99641577 0.99280576] mean value: 0.9949653180681262 key: test_roc_auc value: [0.91935484 0.9516129 0.98387097 0.98387097 0.98387097 0.98387097 0.90322581 0.98387097 0.95107527 0.93387097] mean value: 0.9578494623655914 key: train_roc_auc value: [0.99280576 0.98920863 0.99100719 0.99100719 0.98920863 0.99100719 0.98741007 0.99280576 0.9928122 0.98923442] mean value: 0.9906507052422577 key: test_jcc value: [0.85714286 0.90909091 0.96774194 0.96774194 0.96774194 0.96774194 0.83783784 0.96875 0.90625 0.88235294] mean value: 0.9232392287183558 key: train_jcc value: [0.98576512 0.9787234 0.9822695 0.9822695 0.9787234 0.9822695 0.97526502 0.98576512 0.9858156 0.9787234 ] mean value: 0.9815589593019299 MCC on Blind test: 0.1 Accuracy on Blind test: 0.25 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02520299 0.01059842 0.01069093 0.0106535 0.01082826 0.01074505 0.0108285 0.01074529 0.01063871 0.01066732] mean value: 0.012159895896911622 key: score_time value: [0.01116538 0.00920391 0.00933433 0.00932336 0.0090549 0.00906706 0.00910687 0.00905514 0.00927353 0.00904298] mean value: 0.009362745285034179 key: test_mcc value: [0.61418277 0.64549722 0.7190925 0.61418277 0.67883359 0.64549722 0.56761348 0.64820372 0.61256703 0.60733867] mean value: 0.6353008987490497 key: train_mcc value: [0.67625899 0.66202471 0.66557529 0.69446479 0.683524 0.65510022 0.68741069 0.6870548 0.67744343 0.68448223] mean value: 0.6773339149399755 key: test_accuracy value: [0.80645161 0.82258065 0.85483871 0.80645161 0.83870968 0.82258065 0.77419355 0.82258065 0.80327869 0.80327869] mean value: 0.8154944473823373 key: train_accuracy value: [0.8381295 0.83093525 0.83273381 0.8471223 0.84172662 0.82733813 0.84352518 0.84352518 0.83842011 0.84201077] mean value: 0.8385466850935769 key: test_fscore value: [0.8125 0.82539683 0.84210526 0.8 0.83333333 0.81967213 0.8 0.83076923 0.8125 0.8125 ] mean value: 0.8188776783804825 key: train_fscore value: [0.8381295 0.83274021 0.8342246 0.84902309 0.84285714 0.83038869 0.8460177 0.8438061 0.84210526 0.84452297] mean value: 0.8403815269479368 key: test_precision value: [0.78787879 0.8125 0.92307692 0.82758621 0.86206897 0.83333333 0.71794872 0.79411765 0.76470588 0.78787879] mean value: 0.8111095251942108 key: train_precision value: [0.8381295 0.82394366 0.82685512 0.83859649 0.83687943 0.81597222 0.83275261 0.84229391 0.82474227 0.82986111] mean value: 0.8310026327326828 key: test_recall value: [0.83870968 0.83870968 0.77419355 0.77419355 0.80645161 0.80645161 0.90322581 0.87096774 0.86666667 0.83870968] mean value: 0.8318279569892473 key: train_recall value: [0.8381295 0.84172662 0.84172662 0.85971223 0.84892086 0.84532374 0.85971223 0.84532374 0.86021505 0.85971223] mean value: 0.8500502823547613 key: test_roc_auc value: [0.80645161 0.82258065 0.85483871 0.80645161 0.83870968 0.82258065 0.77419355 0.82258065 0.80430108 0.80268817] mean value: 0.8155376344086022 key: train_roc_auc value: [0.8381295 0.83093525 0.83273381 0.8471223 0.84172662 0.82733813 0.84352518 0.84352518 0.83838091 0.8420425 ] mean value: 0.8385459374435935 key: test_jcc value: [0.68421053 0.7027027 0.72727273 0.66666667 0.71428571 0.69444444 0.66666667 0.71052632 0.68421053 0.68421053] mean value: 0.6935196816775764 key: train_jcc value: [0.72136223 0.71341463 0.71559633 0.73765432 0.72839506 0.70996979 0.73312883 0.72981366 0.72727273 0.73088685] mean value: 0.7247494441137159 MCC on Blind test: 0.19 Accuracy on Blind test: 0.52 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.09478092 0.07708526 0.08946919 0.23117948 0.07251072 0.07828522 0.08111978 0.0676651 0.07024503 0.07174253] mean value: 0.09340832233428956 key: score_time value: [0.01199675 0.01191998 0.01227951 0.0113728 0.01115179 0.01123452 0.01101041 0.01072311 0.01149058 0.01141882] mean value: 0.011459827423095703 key: test_mcc value: [0.96824584 0.93743687 0.96824584 1. 1. 1. 0.87278605 0.93548387 0.90215054 0.96770777] mean value: 0.9552056769139011 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.96774194 0.98387097 1. 1. 1. 0.93548387 0.96774194 0.95081967 0.98360656] mean value: 0.9773135906927551 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98360656 0.96666667 0.98360656 1. 1. 1. 0.9375 0.96774194 0.95081967 0.98412698] mean value: 0.9774068373162768 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 1. 1. 1. 0.90909091 0.96774194 0.93548387 0.96875 ] mean value: 0.9781066715542522 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 0.96774194 1. 1. 1. 0.96774194 0.96774194 0.96666667 1. ] mean value: 0.9773118279569892 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.96774194 0.98387097 1. 1. 1. 0.93548387 0.96774194 0.95107527 0.98333333] mean value: 0.9773118279569892 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96774194 0.93548387 0.96774194 1. 1. 1. 0.88235294 0.9375 0.90625 0.96875 ] mean value: 0.9565820683111954 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04589272 0.06550336 0.05837011 0.08017612 0.0648365 0.09038734 0.06400347 0.06171775 0.09057832 0.05633402] mean value: 0.06777997016906738 key: score_time value: [0.01948833 0.01549053 0.01913047 0.01229525 0.01808548 0.01914907 0.01216698 0.01918221 0.01920176 0.01234913] mean value: 0.016653919219970705 key: test_mcc value: [0.93743687 0.87096774 0.87096774 0.80813523 0.78446454 0.87278605 0.93743687 0.84266484 0.80516731 0.8688172 ] mean value: 0.8598844392978637 key: train_mcc value: [0.97124816 0.96768225 0.96073627 0.96058703 0.96405373 0.95353974 0.94986154 0.94634322 0.94643646 0.95362457] mean value: 0.9574112963993591 key: test_accuracy value: [0.96774194 0.93548387 0.93548387 0.90322581 0.88709677 0.93548387 0.96774194 0.91935484 0.90163934 0.93442623] mean value: 0.9287678476996298 key: train_accuracy value: [0.98561151 0.98381295 0.98021583 0.98021583 0.98201439 0.97661871 0.97482014 0.97302158 0.97307002 0.97666068] mean value: 0.9786061635431331 key: test_fscore value: [0.96666667 0.93548387 0.93548387 0.9 0.87719298 0.93333333 0.96875 0.91525424 0.90322581 0.93548387] mean value: 0.9270874639099115 key: train_fscore value: [0.98566308 0.98389982 0.98046181 0.98039216 0.98207885 0.97690941 0.97508897 0.97335702 0.97345133 0.97690941] mean value: 0.9788211864278304 key: test_precision value: [1. 0.93548387 0.93548387 0.93103448 0.96153846 0.96551724 0.93939394 0.96428571 0.875 0.93548387] mean value: 0.9443221452259272 key: train_precision value: [0.98214286 0.97864769 0.96842105 0.97173145 0.97857143 0.96491228 0.96478873 0.96140351 0.96153846 0.96491228] mean value: 0.9697069738050123 key: test_recall value: [0.93548387 0.93548387 0.93548387 0.87096774 0.80645161 0.90322581 1. 0.87096774 0.93333333 0.93548387] mean value: 0.9126881720430107 key: train_recall value: [0.98920863 0.98920863 0.99280576 0.98920863 0.98561151 0.98920863 0.98561151 0.98561151 0.98566308 0.98920863] mean value: 0.9881346535674685 key: test_roc_auc value: [0.96774194 0.93548387 0.93548387 0.90322581 0.88709677 0.93548387 0.96774194 0.91935484 0.90215054 0.9344086 ] mean value: 0.9288172043010753 key: train_roc_auc value: [0.98561151 0.98381295 0.98021583 0.98021583 0.98201439 0.97661871 0.97482014 0.97302158 0.97304737 0.97668317] mean value: 0.9786061473401924 key: test_jcc value: [0.93548387 0.87878788 0.87878788 0.81818182 0.78125 0.875 0.93939394 0.84375 0.82352941 0.87878788] mean value: 0.8652952676671841 key: train_jcc value: [0.97173145 0.96830986 0.96167247 0.96153846 0.96478873 0.95486111 0.95138889 0.94809689 0.94827586 0.95486111] mean value: 0.9585524834711829 MCC on Blind test: 0.13 Accuracy on Blind test: 0.37 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02431893 0.01048541 0.01121306 0.01110697 0.01107454 0.01016903 0.00993896 0.01057076 0.01003838 0.01006365] mean value: 0.011897969245910644 key: score_time value: [0.00927353 0.00901461 0.00943899 0.00953412 0.00949192 0.00871015 0.00875497 0.00883937 0.00874853 0.00869203] mean value: 0.009049820899963378 key: test_mcc value: [0.61807005 0.67883359 0.83914639 0.54953196 0.61290323 0.67883359 0.52981294 0.7130241 0.54459739 0.54459739] mean value: 0.6309350624207806 key: train_mcc value: [0.64855706 0.64509217 0.63130015 0.66326227 0.66702732 0.6419512 0.65158942 0.63788443 0.65388715 0.6707996 ] mean value: 0.6511350775809711 key: test_accuracy value: [0.80645161 0.83870968 0.91935484 0.77419355 0.80645161 0.83870968 0.75806452 0.85483871 0.7704918 0.7704918 ] mean value: 0.8137757800105764 key: train_accuracy value: [0.82374101 0.82194245 0.8147482 0.83093525 0.83273381 0.82014388 0.82553957 0.81834532 0.82585278 0.83482944] mean value: 0.8248811722614727 key: test_fscore value: [0.81818182 0.84375 0.92063492 0.76666667 0.80645161 0.84375 0.7826087 0.86153846 0.75 0.78787879] mean value: 0.8181460963456055 key: train_fscore value: [0.82867133 0.82722513 0.82149047 0.83623693 0.83826087 0.82638889 0.82892416 0.82373473 0.83304647 0.83916084] mean value: 0.830313982226392 key: test_precision value: [0.77142857 0.81818182 0.90625 0.79310345 0.80645161 0.81818182 0.71052632 0.82352941 0.80769231 0.74285714] mean value: 0.7998202447074926 key: train_precision value: [0.80612245 0.80338983 0.79264214 0.81081081 0.81144781 0.79865772 0.81314879 0.8 0.8013245 0.81632653] mean value: 0.805387058318656 key: test_recall value: [0.87096774 0.87096774 0.93548387 0.74193548 0.80645161 0.87096774 0.87096774 0.90322581 0.7 0.83870968] mean value: 0.8409677419354838 key: train_recall value: [0.85251799 0.85251799 0.85251799 0.86330935 0.86690647 0.85611511 0.84532374 0.84892086 0.86738351 0.86330935] mean value: 0.8568822361465667 key: test_roc_auc value: [0.80645161 0.83870968 0.91935484 0.77419355 0.80645161 0.83870968 0.75806452 0.85483871 0.76935484 0.76935484] mean value: 0.8135483870967741 key: train_roc_auc value: [0.82374101 0.82194245 0.8147482 0.83093525 0.83273381 0.82014388 0.82553957 0.81834532 0.82577809 0.83488048] mean value: 0.8248788066321137 key: test_jcc value: [0.69230769 0.72972973 0.85294118 0.62162162 0.67567568 0.72972973 0.64285714 0.75675676 0.6 0.65 ] mean value: 0.6951619525148937 key: train_jcc value: [0.70746269 0.70535714 0.69705882 0.71856287 0.72155689 0.70414201 0.70783133 0.70029674 0.71386431 0.72289157] mean value: 0.7099024359523051 MCC on Blind test: 0.19 Accuracy on Blind test: 0.55 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01834035 0.0359242 0.02628255 0.0305891 0.02540517 0.03625464 0.03056884 0.02949548 0.03100276 0.03467536] mean value: 0.02985384464263916 key: score_time value: [0.00996852 0.01121902 0.01181126 0.01182604 0.01179433 0.01183081 0.01177526 0.01184487 0.01179695 0.01184797] mean value: 0.011571502685546875 key: test_mcc value: [0.51507875 0.87278605 0.90748521 1. 0.87831007 0.90748521 0.81325006 0.87278605 0.77454559 0.93635873] mean value: 0.8478085723184035 key: train_mcc value: [0.59972626 0.97487691 0.94986154 0.97487691 0.90161686 0.9393413 0.97482645 0.97487691 0.97127459 0.97492232] mean value: 0.9236200060442982 key: test_accuracy value: [0.70967742 0.93548387 0.9516129 1. 0.93548387 0.9516129 0.90322581 0.93548387 0.8852459 0.96721311] mean value: 0.9175039661554732 key: train_accuracy value: [0.76618705 0.98741007 0.97482014 0.98741007 0.94964029 0.96942446 0.98741007 0.98741007 0.98563734 0.98743268] mean value: 0.9582782248169148 key: test_fscore value: [0.59090909 0.9375 0.94915254 1. 0.93103448 0.94915254 0.90909091 0.93333333 0.88888889 0.96875 ] mean value: 0.9057811789726605 key: train_fscore value: [0.69626168 0.98747764 0.97454545 0.98747764 0.94776119 0.96892139 0.98743268 0.98747764 0.98566308 0.98747764] mean value: 0.9510496032258882 key: test_precision value: [1. 0.90909091 1. 1. 1. 1. 0.85714286 0.96551724 0.84848485 0.93939394] mean value: 0.9519629795491864 key: train_precision value: [0.99333333 0.98220641 0.98529412 0.98220641 0.98449612 0.98513011 0.98566308 0.98220641 0.98566308 0.98220641] mean value: 0.9848405474185916 key: test_recall value: [0.41935484 0.96774194 0.90322581 1. 0.87096774 0.90322581 0.96774194 0.90322581 0.93333333 1. ] mean value: 0.8868817204301075 key: train_recall value: [0.53597122 0.99280576 0.96402878 0.99280576 0.91366906 0.95323741 0.98920863 0.99280576 0.98566308 0.99280576] mean value: 0.9313001211933679 key: test_roc_auc value: [0.70967742 0.93548387 0.9516129 1. 0.93548387 0.9516129 0.90322581 0.93548387 0.88602151 0.96666667] mean value: 0.9175268817204302 key: train_roc_auc value: [0.76618705 0.98741007 0.97482014 0.98741007 0.94964029 0.96942446 0.98741007 0.98741007 0.9856373 0.9874423 ] mean value: 0.9582791831051288 key: test_jcc value: [0.41935484 0.88235294 0.90322581 1. 0.87096774 0.90322581 0.83333333 0.875 0.8 0.93939394] mean value: 0.842685440745213 key: train_jcc value: [0.53405018 0.97526502 0.95035461 0.97526502 0.90070922 0.93971631 0.9751773 0.97526502 0.97173145 0.97526502] mean value: 0.917279914545461 MCC on Blind test: 0.15 Accuracy on Blind test: 0.51 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.02572179 0.02486515 0.01888251 0.0222733 0.02065015 0.02569151 0.01913404 0.02014589 0.02283072 0.01933765] mean value: 0.021953272819519042 key: score_time value: [0.0119679 0.01191592 0.01183343 0.01205277 0.01181006 0.01178956 0.01178837 0.0118444 0.01184034 0.01183796] mean value: 0.01186807155609131 key: test_mcc value: [0.81325006 0.83914639 0.74193548 0.90369611 0.87831007 1. 0.81325006 0.81325006 0.80516731 0.78156791] mean value: 0.8389573460852846 key: train_mcc value: [0.96425338 0.93301383 0.91404761 0.96425338 0.87765675 0.91941603 0.91755711 0.87385975 0.90947207 0.83401471] mean value: 0.910754464045474 key: test_accuracy value: [0.90322581 0.91935484 0.87096774 0.9516129 0.93548387 1. 0.90322581 0.90322581 0.90163934 0.8852459 ] mean value: 0.9173982020095187 key: train_accuracy value: [0.98201439 0.96582734 0.95683453 0.98201439 0.93705036 0.95863309 0.95863309 0.93345324 0.95332136 0.91202873] mean value: 0.9539810521421284 key: test_fscore value: [0.90909091 0.91803279 0.87096774 0.95238095 0.93103448 1. 0.90909091 0.90909091 0.90322581 0.87719298] mean value: 0.9180107480140782 key: train_fscore value: [0.98220641 0.96487985 0.95744681 0.98220641 0.93408663 0.96 0.95914742 0.93739425 0.95517241 0.90448343] mean value: 0.9537023617168902 key: test_precision value: [0.85714286 0.93333333 0.87096774 0.9375 1. 1. 0.85714286 0.85714286 0.875 0.96153846] mean value: 0.914976810823585 key: train_precision value: [0.97183099 0.99239544 0.94405594 0.97183099 0.98023715 0.92929293 0.94736842 0.88498403 0.92026578 0.98723404] mean value: 0.952949570648824 key: test_recall value: [0.96774194 0.90322581 0.87096774 0.96774194 0.87096774 1. 0.96774194 0.96774194 0.93333333 0.80645161] mean value: 0.9255913978494623 key: train_recall value: [0.99280576 0.93884892 0.97122302 0.99280576 0.89208633 0.99280576 0.97122302 0.99640288 0.99283154 0.83453237] mean value: 0.9575565354168278 key: test_roc_auc value: [0.90322581 0.91935484 0.87096774 0.9516129 0.93548387 1. 0.90322581 0.90322581 0.90215054 0.88655914] mean value: 0.9175806451612903 key: train_roc_auc value: [0.98201439 0.96582734 0.95683453 0.98201439 0.93705036 0.95863309 0.95863309 0.93345324 0.9532503 0.91188984] mean value: 0.9539600577602434 key: test_jcc value: [0.83333333 0.84848485 0.77142857 0.90909091 0.87096774 1. 0.83333333 0.83333333 0.82352941 0.78125 ] mean value: 0.8504751482704519 key: train_jcc value: [0.96503497 0.93214286 0.91836735 0.96503497 0.87632509 0.92307692 0.92150171 0.88216561 0.91419142 0.82562278] mean value: 0.9123463652090518 MCC on Blind test: 0.18 Accuracy on Blind test: 0.88 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.2011168 0.18491268 0.18607616 0.19021749 0.18520522 0.18583274 0.19318509 0.18674707 0.18786812 0.18651438] mean value: 0.18876757621765136 key: score_time value: [0.01528382 0.01524591 0.01567864 0.01531458 0.01553869 0.01541805 0.0158565 0.01555061 0.01530504 0.0157001 ] mean value: 0.01548919677734375 key: test_mcc value: [0.96824584 0.96824584 0.96824584 0.90748521 0.96824584 1. 0.87278605 0.93548387 0.90215054 0.96770777] mean value: 0.9458596788654657 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.98387097 0.98387097 0.9516129 0.98387097 1. 0.93548387 0.96774194 0.95081967 0.98360656] mean value: 0.9724748810153359 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98360656 0.98412698 0.98360656 0.94915254 0.98360656 1. 0.9375 0.96774194 0.95081967 0.98412698] mean value: 0.9724287790373015 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.96875 1. 1. 1. 1. 0.90909091 0.96774194 0.93548387 0.96875 ] mean value: 0.9749816715542522 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 1. 0.96774194 0.90322581 0.96774194 1. 0.96774194 0.96774194 0.96666667 1. ] mean value: 0.9708602150537634 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.98387097 0.98387097 0.9516129 0.98387097 1. 0.93548387 0.96774194 0.95107527 0.98333333] mean value: 0.97247311827957 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96774194 0.96875 0.96774194 0.90322581 0.96774194 1. 0.88235294 0.9375 0.90625 0.96875 ] mean value: 0.9470054554079697 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.09 Accuracy on Blind test: 0.2 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.07024312 0.06641197 0.09449553 0.06638598 0.07757902 0.08163714 0.08580494 0.07853961 0.09642267 0.08542395] mean value: 0.08029439449310302 key: score_time value: [0.03562093 0.03045201 0.03463125 0.02480125 0.04036546 0.03743124 0.03029346 0.03446913 0.04009056 0.03119063] mean value: 0.033934593200683594 key: test_mcc value: [0.96824584 0.93743687 0.96824584 0.93743687 0.96824584 1. 0.96824584 0.84266484 0.90215054 0.96770777] mean value: 0.9460380230276315 key: train_mcc value: [0.99640932 0.99640932 0.98926624 0.99283145 0.99640932 0.99640932 0.98563702 1. 0.99641572 0.99641572] mean value: 0.9946203448868983 key: test_accuracy value: [0.98387097 0.96774194 0.98387097 0.96774194 0.98387097 1. 0.98387097 0.91935484 0.95081967 0.98360656] mean value: 0.9724748810153359 key: train_accuracy value: [0.99820144 0.99820144 0.99460432 0.99640288 0.99820144 0.99820144 0.99280576 1. 0.99820467 0.99820467] mean value: 0.9973028040763081 key: test_fscore value: [0.98360656 0.96666667 0.98360656 0.96666667 0.98360656 1. 0.98412698 0.91525424 0.95081967 0.98412698] mean value: 0.9718480883137732 key: train_fscore value: [0.9981982 0.9981982 0.99457505 0.99638989 0.99820467 0.9981982 0.99277978 1. 0.99821109 0.9981982 ] mean value: 0.9972953272188904 key: test_precision value: [1. 1. 1. 1. 1. 1. 0.96875 0.96428571 0.93548387 0.96875 ] mean value: 0.9837269585253456 key: train_precision value: [1. 1. 1. 1. 0.99641577 1. 0.99637681 1. 0.99642857 1. ] mean value: 0.9989221153632093 key: test_recall value: [0.96774194 0.93548387 0.96774194 0.93548387 0.96774194 1. 1. 0.87096774 0.96666667 1. ] mean value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis] 0.9611827956989247 key: train_recall value: [0.99640288 0.99640288 0.98920863 0.99280576 1. 0.99640288 0.98920863 1. 1. 0.99640288] mean value: 0.99568345323741 key: test_roc_auc value: [0.98387097 0.96774194 0.98387097 0.96774194 0.98387097 1. 0.98387097 0.91935484 0.95107527 0.98333333] mean value: 0.97247311827957 key: train_roc_auc value: [0.99820144 0.99820144 0.99460432 0.99640288 0.99820144 0.99820144 0.99280576 1. 0.99820144 0.99820144] mean value: 0.9973021582733813 key: test_jcc value: [0.96774194 0.93548387 0.96774194 0.93548387 0.96774194 1. 0.96875 0.84375 0.90625 0.96875 ] mean value: 0.9461693548387097 key: train_jcc value: [0.99640288 0.99640288 0.98920863 0.99280576 0.99641577 0.99640288 0.98566308 1. 0.99642857 0.99640288] mean value: 0.9946133323755741 MCC on Blind test: 0.01 Accuracy on Blind test: 0.2 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.19087005 0.18680048 0.12759018 0.22758508 0.21411705 0.20253134 0.20308423 0.20282555 0.17823601 0.17444253] mean value: 0.19080824851989747 key: score_time value: [0.0271349 0.01631045 0.01627851 0.03460932 0.03289127 0.02667069 0.02678728 0.02722788 0.02839899 0.03015685] mean value: 0.02664661407470703 key: test_mcc value: [0.77459667 0.7130241 0.74819006 0.74819006 0.83914639 0.7130241 0.61290323 0.74348441 0.83984455 0.77096774] mean value: 0.7503371290040649 key: train_mcc value: [0.96043787 0.97122302 0.97124816 0.97122302 0.96402878 0.96402878 0.97487691 0.97124816 0.97127459 0.95693712] mean value: 0.9676526401636363 key: test_accuracy value: [0.88709677 0.85483871 0.87096774 0.87096774 0.91935484 0.85483871 0.80645161 0.87096774 0.91803279 0.8852459 ] mean value: 0.8738762559492332 key: train_accuracy value: [0.98021583 0.98561151 0.98561151 0.98561151 0.98201439 0.98201439 0.98741007 0.98561151 0.98563734 0.97845601] mean value: 0.9838194076695556 key: test_fscore value: [0.8852459 0.86153846 0.86206897 0.86206897 0.91803279 0.84745763 0.80645161 0.86666667 0.9122807 0.8852459 ] mean value: 0.8707057591179801 key: train_fscore value: [0.98025135 0.98561151 0.98555957 0.98561151 0.98201439 0.98201439 0.98734177 0.98555957 0.98566308 0.97849462] mean value: 0.9838121756879349 key: test_precision value: [0.9 0.82352941 0.92592593 0.92592593 0.93333333 0.89285714 0.80645161 0.89655172 0.96296296 0.9 ] mean value: 0.8967538039811154 key: train_precision value: [0.97849462 0.98561151 0.98913043 0.98561151 0.98201439 0.98201439 0.99272727 0.98913043 0.98566308 0.975 ] mean value: 0.9845397646946831 key: test_recall value: [0.87096774 0.90322581 0.80645161 0.80645161 0.90322581 0.80645161 0.80645161 0.83870968 0.86666667 0.87096774] mean value: 0.8479569892473118 key: train_recall value: [0.98201439 0.98561151 0.98201439 0.98561151 0.98201439 0.98201439 0.98201439 0.98201439 0.98566308 0.98201439] mean value: 0.983098682344447 key: test_roc_auc value: [0.88709677 0.85483871 0.87096774 0.87096774 0.91935484 0.85483871 0.80645161 0.87096774 0.9172043 0.88548387] mean value: 0.8738172043010752 key: train_roc_auc value: [0.98021583 0.98561151 0.98561151 0.98561151 0.98201439 0.98201439 0.98741007 0.98561151 0.9856373 0.97846239] mean value: 0.9838200407416002 key: test_jcc value: [0.79411765 0.75675676 0.75757576 0.75757576 0.84848485 0.73529412 0.67567568 0.76470588 0.83870968 0.79411765] mean value: 0.7723013767605797 key: train_jcc value: [0.96126761 0.97163121 0.97153025 0.97163121 0.96466431 0.96466431 0.975 0.97153025 0.97173145 0.95789474] mean value: 0.9681545322715445 MCC on Blind test: 0.19 Accuracy on Blind test: 0.49 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.79381752 0.74333286 0.76294351 0.77773452 0.77165318 0.73637152 0.75150609 0.73600364 0.74480796 0.75842977] mean value: 0.7576600551605225 key: score_time value: [0.01070094 0.00964355 0.01037812 0.01054597 0.00949955 0.00938725 0.00942111 0.00961471 0.00962543 0.0094974 ] mean value: 0.009831404685974121 key: test_mcc value: [0.96824584 0.93743687 0.93548387 1. 0.96824584 1. 0.90748521 0.93548387 0.87082935 0.96770777] mean value: 0.9490918620004005 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.98387097 0.96774194 0.96774194 1. 0.98387097 1. 0.9516129 0.96774194 0.93442623 0.98360656] mean value: 0.9740613432046537 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.98360656 0.96666667 0.96774194 1. 0.98360656 1. 0.95384615 0.96774194 0.93548387 0.98412698] mean value: 0.9742820661329387 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.96774194 1. 1. 1. 0.91176471 0.96774194 0.90625 0.96875 ] mean value: 0.9722248576850094 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.96774194 0.93548387 0.96774194 1. 0.96774194 1. 1. 0.96774194 0.96666667 1. ] mean value: 0.9773118279569892 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.98387097 0.96774194 0.96774194 1. 0.98387097 1. 0.9516129 0.96774194 0.93494624 0.98333333] mean value: 0.9740860215053764 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.96774194 0.93548387 0.9375 1. 0.96774194 1. 0.91176471 0.9375 0.87878788 0.96875 ] mean value: 0.9505270326605716 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.2 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03455734 0.04919267 0.04343057 0.04365683 0.03343225 0.03359771 0.03364921 0.03464127 0.03343654 0.03420115] mean value: 0.03737955093383789 key: score_time value: [0.01318121 0.01318598 0.02024555 0.01931953 0.01600266 0.03063107 0.01645756 0.01628017 0.0201664 0.01660943] mean value: 0.018207955360412597 key: test_mcc value: [0.50083542 0.42289003 0.54953196 0.7130241 0.80645161 0.42289003 0.48488114 0.64549722 0.50975101 0.55307979] mean value: 0.5608832308024235 key: train_mcc value: [0.70615316 0.78802998 0.84192273 0.84576707 0.86411476 0.83287425 0.82387639 0.82642623 0.81259544 0.6893826 ] mean value: 0.8031142630806966 key: test_accuracy value: [0.74193548 0.70967742 0.77419355 0.85483871 0.90322581 0.70967742 0.74193548 0.82258065 0.75409836 0.7704918 ] mean value: 0.7782654680063459 key: train_accuracy value: [0.83273381 0.88309353 0.92086331 0.92266187 0.93165468 0.91366906 0.90827338 0.90647482 0.89766607 0.82226212] mean value: 0.8939352647146197 key: test_fscore value: [0.7037037 0.68965517 0.78125 0.84745763 0.90322581 0.68965517 0.75 0.81967213 0.73684211 0.75 ] mean value: 0.7671461718512246 key: train_fscore value: [0.79913607 0.86761711 0.92 0.9213894 0.93309859 0.90839695 0.9017341 0.8972332 0.88622754 0.7833698 ] mean value: 0.8818202765481856 key: test_precision value: [0.82608696 0.74074074 0.75757576 0.89285714 0.90322581 0.74074074 0.72727273 0.83333333 0.77777778 0.84 ] mean value: 0.8039610983271572 key: train_precision value: [1. 1. 0.93014706 0.93680297 0.9137931 0.96747967 0.97095436 0.99561404 1. 1. ] mean value: 0.9714791202980441 key: test_recall value: [0.61290323 0.64516129 0.80645161 0.80645161 0.90322581 0.64516129 0.77419355 0.80645161 0.7 0.67741935] mean value: 0.7377419354838709 key: train_recall value: [0.66546763 0.76618705 0.91007194 0.90647482 0.95323741 0.85611511 0.84172662 0.81654676 0.79569892 0.64388489] mean value: 0.815541115494701 key: test_roc_auc value: [0.74193548 0.70967742 0.77419355 0.85483871 0.90322581 0.70967742 0.74193548 0.82258065 0.75322581 0.77204301] mean value: 0.7783333333333333 key: train_roc_auc value: [0.83273381 0.88309353 0.92086331 0.92266187 0.93165468 0.91366906 0.90827338 0.90647482 0.89784946 0.82194245] mean value: 0.8939216368840411 key: test_jcc value: [0.54285714 0.52631579 0.64102564 0.73529412 0.82352941 0.52631579 0.6 0.69444444 0.58333333 0.6 ] mean value: 0.6273115670019694 key: train_jcc value: [0.66546763 0.76618705 0.85185185 0.85423729 0.87458746 0.83216783 0.82105263 0.81362007 0.79569892 0.64388489] mean value: 0.7918755627241194 MCC on Blind test: 0.03 Accuracy on Blind test: 0.51 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.02617693 0.03780103 0.03633928 0.03645992 0.0377748 0.03427029 0.03479242 0.03106213 0.01793814 0.0177362 ] mean value: 0.03103511333465576 key: score_time value: [0.02564049 0.0231626 0.02571344 0.02222896 0.0219202 0.02516079 0.0265336 0.01335716 0.02909446 0.01307034] mean value: 0.022588205337524415 key: test_mcc value: [0.90369611 0.90369611 0.83914639 0.90369611 0.90369611 0.93548387 0.82199494 0.83914639 0.67858574 0.83984455] mean value: 0.8568986328829187 key: train_mcc value: [0.93987712 0.94305636 0.93214329 0.94986154 0.95705746 0.94634322 0.94634322 0.9393413 0.94643646 0.94277021] mean value: 0.9443230181420963 key: test_accuracy value: [0.9516129 0.9516129 0.91935484 0.9516129 0.9516129 0.96774194 0.90322581 0.91935484 0.83606557 0.91803279] mean value: 0.9270227392913802 key: train_accuracy value: [0.96942446 0.97122302 0.96582734 0.97482014 0.97841727 0.97302158 0.97302158 0.96942446 0.97307002 0.97127469] mean value: 0.9719524559885305 key: test_fscore value: [0.95238095 0.95238095 0.91803279 0.95238095 0.95238095 0.96774194 0.91176471 0.91803279 0.84375 0.92307692] mean value: 0.9291922947737448 key: train_fscore/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:195: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_orig.py:198: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) value: [0.97012302 0.97173145 0.96637168 0.97508897 0.97864769 0.97335702 0.97335702 0.9699115 0.97345133 0.97153025] mean value: 0.9723569920770859 key: test_precision value: [0.9375 0.9375 0.93333333 0.9375 0.9375 0.96774194 0.83783784 0.93333333 0.79411765 0.88235294] mean value: 0.909871702822367 key: train_precision value: [0.94845361 0.95486111 0.95121951 0.96478873 0.96830986 0.96140351 0.96140351 0.95470383 0.96153846 0.96126761] mean value: 0.9587949740571688 key: test_recall value: [0.96774194 0.96774194 0.90322581 0.96774194 0.96774194 0.96774194 1. 0.90322581 0.9 0.96774194] mean value: 0.9512903225806452 key: train_recall value: [0.99280576 0.98920863 0.98201439 0.98561151 0.98920863 0.98561151 0.98561151 0.98561151 0.98566308 0.98201439] mean value: 0.9863360924163894 key: test_roc_auc value: [0.9516129 0.9516129 0.91935484 0.9516129 0.9516129 0.96774194 0.90322581 0.91935484 0.83709677 0.9172043 ] mean value: 0.9270430107526882 key: train_roc_auc value: [0.96942446 0.97122302 0.96582734 0.97482014 0.97841727 0.97302158 0.97302158 0.96942446 0.97304737 0.97129393] mean value: 0.9719521157267734 key: test_jcc value: [0.90909091 0.90909091 0.84848485 0.90909091 0.90909091 0.9375 0.83783784 0.84848485 0.72972973 0.85714286] mean value: 0.8695543758043758 key: train_jcc value: [0.94197952 0.94501718 0.93493151 0.95138889 0.95818815 0.94809689 0.94809689 0.94158076 0.94827586 0.94463668] mean value: 0.9462192321272894 MCC on Blind test: 0.14 Accuracy on Blind test: 0.45 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=168)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.19797468 0.3235662 0.41162133 0.29132533 0.34859681 0.40734649 0.45757914 0.48200107 0.28318882 0.37048697] mean value: 0.3573686838150024 key: score_time value: [0.02385783 0.01895332 0.01230931 0.01896501 0.02679181 0.02773142 0.02564216 0.02602696 0.0194571 0.01240873] mean value: 0.02121436595916748 key: test_mcc value: [0.90369611 0.90369611 0.83914639 0.90369611 0.87278605 0.93548387 0.82199494 0.83914639 0.71525965 0.83984455] mean value: 0.8574750173276433 key: train_mcc value: [0.93987712 0.94305636 0.93214329 0.94986154 0.95693359 0.94634322 0.94634322 0.9393413 0.94643646 0.94277021] mean value: 0.9443106311332057 key: test_accuracy value: [0.9516129 0.9516129 0.91935484 0.9516129 0.93548387 0.96774194 0.90322581 0.91935484 0.85245902 0.91803279] mean value: 0.9270491803278689 key: train_accuracy value: [0.96942446 0.97122302 0.96582734 0.97482014 0.97841727 0.97302158 0.97302158 0.96942446 0.97307002 0.97127469] mean value: 0.9719524559885305 key: test_fscore value: [0.95238095 0.95238095 0.91803279 0.95238095 0.93333333 0.96774194 0.91176471 0.91803279 0.86153846 0.92307692] mean value: 0.9290663790228291 key: train_fscore value: [0.97012302 0.97173145 0.96637168 0.97508897 0.97857143 0.97335702 0.97335702 0.9699115 0.97345133 0.97153025] mean value: 0.9723493662509547 key: test_precision value: [0.9375 0.9375 0.93333333 0.9375 0.96551724 0.96774194 0.83783784 0.93333333 0.8 0.88235294] mean value: 0.9132616622544156 key: train_precision value: [0.94845361 0.95486111 0.95121951 0.96478873 0.97163121 0.96140351 0.96140351 0.95470383 0.96153846 0.96126761] mean value: 0.9591271087090518 key: test_recall value: [0.96774194 0.96774194 0.90322581 0.96774194 0.90322581 0.96774194 1. 0.90322581 0.93333333 0.96774194] mean value: 0.9481720430107528 key: train_recall value: [0.99280576 0.98920863 0.98201439 0.98561151 0.98561151 0.98561151 0.98561151 0.98561151 0.98566308 0.98201439] mean value: 0.9859763801861736 key: test_roc_auc value: [0.9516129 0.9516129 0.91935484 0.9516129 0.93548387 0.96774194 0.90322581 0.91935484 0.85376344 0.9172043 ] mean value: 0.9270967741935484 key: train_roc_auc value: [0.96942446 0.97122302 0.96582734 0.97482014 0.97841727 0.97302158 0.97302158 0.96942446 0.97304737 0.97129393] mean value: 0.9719521157267734 key: test_jcc value: [0.90909091 0.90909091 0.84848485 0.90909091 0.875 0.9375 0.83783784 0.84848485 0.75675676 0.85714286] mean value: 0.8688479875979875 key: train_jcc value: [0.94197952 0.94501718 0.93493151 0.95138889 0.95804196 0.94809689 0.94809689 0.94158076 0.94827586 0.94463668] mean value: 0.9462046126004747 MCC on Blind test: 0.14 Accuracy on Blind test: 0.45