/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_sl.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
1.22.4
1.4.1
aaindex_df contains non-numerical data
Total no.
of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 424
PASS: my_features_df and aa_df successfully combined
nrows: 424
ncols: 265
count of NULL values before imputation
or_mychisq          102
log10_or_mychisq    102
dtype: int64
count of NULL values AFTER imputation
mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
No. of numerical features: 166
No. of categorical features: 7
PASS: x_features has no target variable
No. of columns for x_features: 173
-------------------------------------------------------------
Successfully split data according to scaling law: 1/np.sqrt(x_ncols)
Train data size: (170, 173)
Test data size: 0.07602859212697055 (15, 173)
y_train numbers: Counter({1: 105, 0: 65})
y_train ratio: 0.6190476190476191
y_test_numbers: Counter({1: 9, 0: 6})
y_test ratio: 0.6666666666666666
-------------------------------------------------------------
Simple Random OverSampling
Counter({0: 105, 1: 105})
(210, 173)
Simple Random UnderSampling
Counter({0: 65, 1: 65})
(130, 173)
Simple Combined Over and UnderSampling
Counter({0: 105, 1: 105})
(210, 173)
SMOTE_NC OverSampling
Counter({0: 105, 1: 105})
(210, 173)
#####################################################################
Running ML analysis: scaling law split
Gene name: pncA
Drug name: pyrazinamide
Output directory: /home/tanu/git/Data/pyrazinamide/output/ml/tts_sl/
Sanity checks:
ML source data size: (185, 173)
Total input features: (170, 173)
Target feature numbers: Counter({1: 105, 0: 65})
Target features ratio: 0.6190476190476191
#####################################################################
================================================================
Structural features (n): 34
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================
AAindex features (n): 123
================================================================
Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================
Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================
Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================
Pass: No. of features match
#####################################################################
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
 ColumnTransformer(remainder='passthrough',
 transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)),
 ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])),
 ('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.030159 0.03234315 0.06500244 0.0333209 0.03272486 0.04780293 0.03167248 0.15353703 0.06054592 0.03304076]
mean value: 0.052014946937561035
key: score_time
value: [0.01172686 0.01196408 0.02330327 0.01240492 0.01344848 0.01356339 0.01245809 0.01469016 0.01229882 0.01356554]
mean value: 0.013942360877990723
key: test_mcc
value: [0.63262663 0.66299354 0.77151675 0.38122129 0.66299354 0.63262663 0.04351941 0.33371191 0.60385964 0.30389487]
mean value: 0.5028964212212766
key: train_mcc
value: [0.83628052 0.87638923 0.80724696 0.80552514 0.7938003 0.76492233 0.79554375 0.82448293 0.83762196 0.79554375]
mean value: 0.8137356868892114
key: test_accuracy
value: [0.82352941 0.82352941 0.88235294 0.70588235 0.82352941 0.82352941 0.52941176 0.70588235 0.82352941 0.70588235]
mean value: 0.7647058823529411
key: train_accuracy
value: [0.92156863 0.94117647 0.90849673 0.90849673 0.90196078 0.88888889 0.90196078 0.91503268 0.92156863 0.90196078]
mean value: 0.9111111111111111
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( key: test_fscore value: [0.85714286 0.86956522 0.90909091 0.7826087 0.86956522 0.85714286 0.6 0.7826087 0.86956522 0.8 ] mean value: 0.8197289666854884 key: train_fscore value: [0.94 0.95431472 0.93 0.92929293 0.92537313 0.91370558 0.92462312 0.93467337 0.93939394 0.92462312] mean value: 0.9315999905573705 key: test_precision value: [0.81818182 0.76923077 0.83333333 0.69230769 0.76923077 0.9 0.66666667 0.75 0.83333333 0.71428571] mean value: 0.7746570096570097 key: train_precision value: [0.8952381 0.92156863 0.88571429 0.89320388 0.87735849 0.87378641 0.87619048 0.88571429 0.89423077 0.87619048] mean value: 0.8879195797557542 key: test_recall value: [0.9 1. 1. 0.9 1. 0.81818182 0.54545455 0.81818182 0.90909091 0.90909091] mean value: 0.88 key: train_recall value: [0.98947368 0.98947368 0.97894737 0.96842105 0.97894737 0.95744681 0.9787234 0.9893617 0.9893617 0.9787234 ] mean value: 0.9798880179171333 key: test_roc_auc value: [0.80714286 0.78571429 0.85714286 0.66428571 0.78571429 0.82575758 0.52272727 0.65909091 0.78787879 0.62121212] mean value: 0.7316666666666667 key: train_roc_auc value: [0.89990926 0.92577132 0.88602541 0.88938294 0.87740472 0.86855391 0.87919221 0.89298594 0.90146051 0.87919221] mean value: 0.8899878429737624 key: test_jcc value: [0.75 0.76923077 0.83333333 0.64285714 0.76923077 0.75 0.42857143 0.64285714 0.76923077 0.66666667] mean value: 0.7021978021978023 key: train_jcc value: [0.88679245 0.91262136 0.86915888 0.86792453 0.86111111 0.8411215 0.85981308 0.87735849 0.88571429 0.85981308] mean value: 0.8721428769802886 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Logistic RegressionCV Model func: 
LogisticRegressionCV(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', 
RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.90216947 0.87260532 1.08087659 0.74459434 1.0994246 1.10648322 1.21706295 0.96376276 0.86980605 0.99782419] mean value: 0.9854609489440918 key: score_time value: [0.01360488 0.01396418 0.01334357 0.01384592 0.01390672 0.01344657 0.01328254 0.0135169 0.01473761 0.01464629] mean value: 0.013829517364501952 key: test_mcc value: [0.51428571 0.50920105 0.30988989 0.51428571 0.50920105 0.88273483 0.2030906 0.48484848 0.74242424 0.48484848] mean value: 0.5154810071962633 key: train_mcc value: [1. 1. 1. 1. 0.91830889 0.98625704 0.90411865 1. 0.89069566 1. ] mean value: 0.96993802415093 key: test_accuracy value: [0.76470588 0.76470588 0.64705882 0.76470588 0.76470588 0.94117647 0.58823529 0.76470588 0.88235294 0.76470588] mean value: 0.7647058823529411 key: train_accuracy value: [1. 1. 1. 1. 0.96078431 0.99346405 0.95424837 1. 0.94771242 1. ] mean value: 0.9856209150326798 key: test_fscore value: [0.8 0.81818182 0.66666667 0.8 0.81818182 0.95238095 0.63157895 0.81818182 0.90909091 0.81818182] mean value: 0.8032444748234222 key: train_fscore value: [1. 1. 1. 1. 0.96938776 0.99470899 0.96373057 1. 0.95876289 1. 
] mean value: 0.988659020635716 key: test_precision value: [0.8 0.75 0.75 0.8 0.75 1. 0.75 0.81818182 0.90909091 0.81818182] mean value: 0.8145454545454546 key: train_precision value: [1. 1. 1. 1. 0.94059406 0.98947368 0.93939394 1. 0.93 1. ] mean value: 0.9799461683010406 key: test_recall value: [0.8 0.9 0.6 0.8 0.9 0.90909091 0.54545455 0.81818182 0.90909091 0.81818182] mean value: 0.8 key: train_recall value: [1. 1. 1. 1. 1. 1. 0.9893617 1. 0.9893617 1. ] mean value: 0.997872340425532 key: test_roc_auc value: [0.75714286 0.73571429 0.65714286 0.75714286 0.73571429 0.95454545 0.60606061 0.74242424 0.87121212 0.74242424] mean value: 0.755952380952381 key: train_roc_auc value: [1. 1. 1. 1. 0.94827586 0.99152542 0.94383339 1. 0.93535882 1. ] mean value: 0.9818993496400015 key: test_jcc value: [0.66666667 0.69230769 0.5 0.66666667 0.69230769 0.90909091 0.46153846 0.69230769 0.83333333 0.69230769] mean value: 0.6806526806526807 key: train_jcc value: [1. 1. 1. 1. 0.94059406 0.98947368 0.93 1. 0.92079208 1. 
] mean value: 0.9780859822824388 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01363158 0.01329041 0.00920653 0.01015902 0.00940776 0.00897503 0.00879431 0.00902987 0.01002741 0.00941849] mean value: 0.010194039344787598 key: score_time value: [0.01313639 0.01162434 0.00912452 0.01039958 0.00888896 0.00871277 0.00869632 0.00936913 0.00976944 0.00974512] mean value: 0.009946656227111817 key: test_mcc value: [ 0.38122129 0.50920105 0.77151675 0.13241022 0.24688536 -0.01899343 0.22727273 -0.01899343 0.17069719 0.17069719] mean value: 0.25719149163785127 key: train_mcc value: [0.5048764 0.47629849 0.5048764 0.45884418 0.48537027 0.43135777 0.49226514 0.47721276 0.44691625 0.39720759] mean value: 0.46752252677089945 key: test_accuracy value: [0.70588235 0.76470588 0.88235294 0.58823529 0.64705882 0.58823529 0.64705882 0.58823529 0.64705882 0.64705882] mean value: 0.6705882352941177 key: train_accuracy value: [0.77124183 0.75816993 0.77124183 0.75163399 0.76470588 0.71895425 0.76470588 0.75816993 0.74509804 0.68627451] mean value: 0.7490196078431373 key: test_fscore value: [0.7826087 0.81818182 0.90909091 0.66666667 0.72727273 0.72 0.72727273 0.72 0.75 0.75 ] mean value: 
0.7571093544137023 key: train_fscore value: [0.82233503 0.81218274 0.82233503 0.81 0.82524272 0.81385281 0.81818182 0.81407035 0.80597015 0.71084337] mean value: 0.8055014016865908 key: test_precision value: [0.69230769 0.75 0.83333333 0.63636364 0.66666667 0.64285714 0.72727273 0.64285714 0.69230769 0.69230769] mean value: 0.6976273726273726 key: train_precision value: [0.79411765 0.78431373 0.79411765 0.77142857 0.76576577 0.68613139 0.77884615 0.77142857 0.75700935 0.81944444] mean value: 0.7722603259177057 key: test_recall value: [0.9 0.9 1. 0.7 0.8 0.81818182 0.72727273 0.81818182 0.81818182 0.81818182] mean value: 0.8300000000000001 key: train_recall value: [0.85263158 0.84210526 0.85263158 0.85263158 0.89473684 1. 0.86170213 0.86170213 0.86170213 0.62765957] mean value: 0.8507502799552071 key: test_roc_auc value: [0.66428571 0.73571429 0.85714286 0.56428571 0.61428571 0.49242424 0.61363636 0.49242424 0.57575758 0.57575758] mean value: 0.6185714285714285 key: train_roc_auc value: [0.74528131 0.73139746 0.74528131 0.71941924 0.72323049 0.63559322 0.73593581 0.72746123 0.71051208 0.7036603 ] mean value: 0.7177772440103329 key: test_jcc value: [0.64285714 0.69230769 0.83333333 0.5 0.57142857 0.5625 0.57142857 0.5625 0.6 0.6 ] mean value: 0.6136355311355312 key: train_jcc value: [0.69827586 0.68376068 0.69827586 0.68067227 0.70247934 0.68613139 0.69230769 0.68644068 0.675 0.55140187] mean value: 0.675474564194314 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), 
('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.00986075 0.01002407 0.00997305 0.01180744 0.01076865 0.01062441 0.01006889 0.00927901 0.01006889 0.00926876] mean value: 0.010174393653869629 key: score_time value: [0.00983071 0.00966644 0.00964904 0.01030135 0.01011753 0.00926757 0.00965786 0.00972462 0.00960636 0.00913143] mean value: 0.009695291519165039 key: test_mcc value: [ 0.23975611 0.38251843 0.63262663 0.02857143 0.50920105 0.2030906 -0.13241022 -0.28787879 0.13241022 0.17069719] mean value: 0.18785826455207946 key: train_mcc value: [0.42092813 0.39333516 0.32656704 0.46856319 0.39056476 0.41094842 0.47583844 0.38542713 0.43627743 0.37735366] mean value: 0.40858033755597606 key: test_accuracy value: [0.64705882 0.70588235 0.82352941 0.52941176 0.76470588 0.58823529 0.41176471 0.41176471 0.58823529 0.64705882] mean value: 0.611764705882353 key: train_accuracy value: [0.7254902 0.7124183 0.69281046 0.74509804 0.71895425 0.7254902 0.75163399 0.70588235 0.73202614 0.70588235] mean value: 0.7215686274509804 key: test_fscore value: [0.75 0.76190476 0.85714286 0.6 0.81818182 0.63157895 0.44444444 0.54545455 0.66666667 0.75 ] mean value: 0.6825374041163514 key: train_fscore value: [0.77659574 0.76595745 0.76616915 0.78918919 0.78172589 0.78350515 0.79787234 0.75675676 0.78074866 0.76190476] mean value: 0.776042510006011 key: test_precision value: [0.64285714 0.72727273 0.81818182 0.6 0.75 0.75 0.57142857 0.54545455 0.7 0.69230769] mean value: 0.6797502497502498 key: train_precision value: [0.78494624 0.77419355 0.72641509 0.81111111 0.75490196 0.76 0.79787234 0.76923077 0.78494624 0.75789474] mean value: 
0.772151203423883 key: test_recall value: [0.9 0.8 0.9 0.6 0.9 0.54545455 0.36363636 0.54545455 0.63636364 0.81818182] mean value: 0.7009090909090909 key: train_recall value: [0.76842105 0.75789474 0.81052632 0.76842105 0.81052632 0.80851064 0.79787234 0.74468085 0.77659574 0.76595745] mean value: 0.7809406494960806 key: test_roc_auc value: [0.59285714 0.68571429 0.80714286 0.51428571 0.73571429 0.60606061 0.43181818 0.35606061 0.56818182 0.57575758] mean value: 0.5873593073593074 key: train_roc_auc value: [0.71179673 0.69791289 0.65526316 0.7376588 0.68974592 0.70086549 0.73791922 0.69437432 0.71880635 0.68806347] mean value: 0.7032406345084143 key: test_jcc value: [0.6 0.61538462 0.75 0.42857143 0.69230769 0.46153846 0.28571429 0.375 0.5 0.6 ] mean value: 0.5308516483516483 key: train_jcc value: [0.63478261 0.62068966 0.62096774 0.65178571 0.64166667 0.6440678 0.66371681 0.60869565 0.64035088 0.61538462] mean value: 0.6342108142276903 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, 
gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00887656 0.01209188 0.0101974 0.00957799 0.00946403 0.00974488 0.0111022 0.00958228 0.01056123 0.01019073] mean value: 0.010138916969299316 key: score_time value: [0.05371165 0.02205682 0.01622033 0.01528668 0.01598525 0.01636815 0.01718068 0.01583552 0.01490521 0.01607275] mean value: 0.020362305641174316 key: test_mcc value: [ 0.13241022 -0.11769798 -0.46409548 -0.07377111 -0.38729833 0.13241022 -0.11948803 -0.28787879 0.06356417 -0.11769798] mean value: -0.12395430776647315 key: train_mcc value: [0.40852687 0.3435988 0.41056782 0.3789188 0.30157232 0.3679126 0.34836646 0.43470567 0.35262985 0.35371983] mean value: 0.37005190349511996 key: test_accuracy value: [0.58823529 0.47058824 0.35294118 0.52941176 0.41176471 0.58823529 0.52941176 0.41176471 0.58823529 0.47058824] mean value: 0.49411764705882355 key: train_accuracy value: [0.73202614 0.70588235 0.73202614 0.71895425 0.68627451 0.7124183 0.69934641 0.73856209 0.70588235 0.70588235] mean value: 0.7137254901960784 key: test_fscore value: [0.66666667 0.57142857 0.52173913 0.66666667 0.58333333 0.66666667 0.66666667 0.54545455 0.69565217 0.57142857] mean value: 0.6155702992659514 key: train_fscore value: [0.80382775 0.79069767 0.8 0.79227053 0.76923077 0.79047619 0.76767677 0.7979798 0.784689 0.7826087 ] mean value: 0.7879457173246753 key: test_precision value: [0.63636364 0.54545455 0.46153846 0.57142857 0.5 0.7 0.61538462 0.54545455 0.66666667 0.6 ] mean value: 0.5842291042291042 key: train_precision value: [0.73684211 0.70833333 0.74545455 0.73214286 0.7079646 0.71551724 0.73076923 0.75961538 0.71304348 
0.71681416] mean value: 0.7266496937280635 key: test_recall value: [0.7 0.6 0.6 0.8 0.7 0.63636364 0.72727273 0.54545455 0.72727273 0.54545455] mean value: 0.6581818181818182 key: train_recall value: [0.88421053 0.89473684 0.86315789 0.86315789 0.84210526 0.88297872 0.80851064 0.84042553 0.87234043 0.86170213] mean value: 0.8613325867861142 key: test_roc_auc value: [0.56428571 0.44285714 0.3 0.47142857 0.35 0.56818182 0.4469697 0.35606061 0.53030303 0.43939394] mean value: 0.4469480519480519 key: train_roc_auc value: [0.68348457 0.64564428 0.69019964 0.67295826 0.63656987 0.66182834 0.66696718 0.70834836 0.6565092 0.65966462] mean value: 0.6682174330774522 key: test_jcc value: [0.5 0.4 0.35294118 0.5 0.41176471 0.5 0.5 0.375 0.53333333 0.4 ] mean value: 0.44730392156862747 key: train_jcc value: [0.672 0.65384615 0.66666667 0.656 0.625 0.65354331 0.62295082 0.66386555 0.64566929 0.64285714] mean value: 0.6502398927685779 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: SVM Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01092124 0.01062822 0.01066852 0.0106132 0.01072216 0.01072073 0.0104661 0.01049161 0.01138711 0.01187062] mean value: 0.01084895133972168 key: score_time value: [0.00920558 0.0091362 0.00914645 0.00922441 0.00928783 0.00917053 0.00909495 0.00927687 0.00960684 0.00933766] mean value: 0.009248733520507812 key: test_mcc value: [ 0.29880715 0.43643578 0.29880715 0.09944903 0.06546537 0.11236664 -0.01899343 -0.01899343 0.3385016 0.11236664] mean value: 0.17242125126907817 key: train_mcc value: [0.52447344 0.48191696 0.45248357 0.59244006 0.56560446 0.58429818 0.58307945 0.59739548 0.51726562 0.57111391] mean value: 0.5470071110789382 key: test_accuracy value: [0.64705882 0.70588235 0.64705882 0.58823529 0.58823529 0.64705882 0.58823529 0.58823529 0.70588235 0.64705882] mean value: 0.6352941176470589 key: train_accuracy value: [0.76470588 0.74509804 0.73202614 0.79738562 0.78431373 0.79084967 0.79738562 0.79738562 0.75816993 0.78431373] mean value: 0.7751633986928105 key: test_fscore value: [0.76923077 0.8 0.76923077 0.69565217 0.72 0.76923077 0.72 0.72 0.81481481 0.76923077] mean value: 0.7547390065650935 key: train_fscore value: [0.84070796 0.82969432 0.82251082 0.85972851 0.85201794 0.85454545 0.85581395 0.85844749 0.83555556 0.85067873] mean value: 0.8459700739469289 key: test_precision value: [0.625 0.66666667 0.625 0.61538462 0.6 0.66666667 0.64285714 0.64285714 0.6875 0.66666667] mean value: 0.6438598901098901 key: train_precision value: [0.72519084 0.70895522 0.69852941 0.75396825 0.7421875 0.74603175 0.76033058 0.752 0.71755725 0.74015748] mean value: 
0.7344908286075713 key: test_recall value: [1. 1. 1. 0.8 0.9 0.90909091 0.81818182 0.81818182 1. 0.90909091] mean value: 0.9154545454545455 key: train_recall value: [1. 1. 1. 1. 1. 1. 0.9787234 1. 1. 1. ] mean value: 0.9978723404255319 key: test_roc_auc value: [0.57142857 0.64285714 0.57142857 0.54285714 0.52142857 0.53787879 0.49242424 0.49242424 0.58333333 0.53787879] mean value: 0.5493939393939393 key: train_roc_auc value: [0.68965517 0.6637931 0.64655172 0.73275862 0.71551724 0.72881356 0.74359899 0.73728814 0.68644068 0.72033898] mean value: 0.7064756208264422 key: test_jcc value: [0.625 0.66666667 0.625 0.53333333 0.5625 0.625 0.5625 0.5625 0.6875 0.625 ] mean value: 0.6075 key: train_jcc value: [0.72519084 0.70895522 0.69852941 0.75396825 0.7421875 0.74603175 0.74796748 0.752 0.71755725 0.74015748] mean value: 0.7332545187238113 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( 
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [0.8853507 0.85311651 0.85744238 1.03879786 1.08934593 1.1350615 0.69020605 0.69074345 1.01780343 1.07197356] mean value: 0.9329841375350952 key: score_time value: [0.01643276 0.01357484 0.01391101 0.01479197 0.01400852 0.01294899 0.01254201 0.01256633 0.02390885 0.01315498] mean value: 0.014784026145935058 key: test_mcc value: [ 0.38251843 0.63262663 0.50920105 0.13241022 -0.01543033 0.69631062 0.29012943 0.17069719 0.74242424 -0.01899343] mean value: 0.3521894047164522 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.70588235 0.82352941 0.76470588 0.58823529 0.52941176 0.82352941 0.64705882 0.64705882 0.88235294 0.58823529] mean value: 0.7 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.76190476 0.85714286 0.81818182 0.66666667 0.63636364 0.84210526 0.7 0.75 0.90909091 0.72 ] mean value: 0.7661455912508545 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.72727273 0.81818182 0.75 0.63636364 0.58333333 1. 0.77777778 0.69230769 0.90909091 0.64285714] mean value: 0.7537185037185037 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.8 0.9 0.9 0.7 0.7 0.72727273 0.63636364 0.81818182 0.90909091 0.81818182] mean value: 0.7909090909090909 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.68571429 0.80714286 0.73571429 0.56428571 0.49285714 0.86363636 0.65151515 0.57575758 0.87121212 0.49242424] mean value: 0.6740259740259741 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.61538462 0.75 0.69230769 0.5 0.46666667 0.72727273 0.53846154 0.6 0.83333333 0.5625 ] mean value: 0.6285926573426573 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.022614 0.01708698 0.01282835 0.01164603 0.01256561 0.01273346 0.01293755 0.0132103 0.01343513 0.01459908] mean value: 0.014365649223327637 key: score_time value: [0.02368164 0.00920272 0.0085516 0.0085299 0.00866437 0.00875759 0.00901079 0.0088861 0.00909567 0.00948334] mean value: 0.010386371612548828 key: test_mcc value: [0.38251843 0.77151675 0.7 0.51428571 0.88273483 0.88273483 1. 
0.69631062 0.60385964 0.88273483] mean value: 0.731669564240748 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.70588235 0.88235294 0.82352941 0.76470588 0.94117647 0.94117647 1. 0.82352941 0.82352941 0.94117647] mean value: 0.8647058823529412 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.76190476 0.90909091 0.82352941 0.8 0.95238095 0.95238095 1. 0.84210526 0.86956522 0.95238095] mean value: 0.8863338420452433 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.72727273 0.83333333 1. 0.8 0.90909091 1. 1. 1. 0.83333333 1. ] mean value: 0.9103030303030303 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 1. 0.7 0.8 1. 0.90909091 1. 0.72727273 0.90909091 0.90909091] mean value: 0.8754545454545455 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.68571429 0.85714286 0.85 0.75714286 0.92857143 0.95454545 1. 0.86363636 0.78787879 0.95454545] mean value: 0.863917748917749 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.61538462 0.83333333 0.7 0.66666667 0.90909091 0.90909091 1. 0.72727273 0.76923077 0.90909091] mean value: 0.8039160839160839 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.6 Accuracy on Blind test: 0.8 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.10089493 0.09773612 0.09653449 0.09245515 0.0937891 0.09260631 0.09319448 0.09338927 0.09292459 0.09129429] mean value: 0.09448187351226807 key: score_time value: [0.01952076 0.01778722 0.01751566 0.01846743 0.01758528 0.01765728 0.01768947 0.01766562 0.01741314 0.01764417] mean value: 0.017894601821899413 key: test_mcc value: [0.38122129 0.77151675 0.50920105 0.27142857 0.24688536 0.63262663 0.22727273 0.11236664 0.60385964 0.22727273] mean value: 0.3983651389885671 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.70588235 0.88235294 0.76470588 0.64705882 0.64705882 0.82352941 0.64705882 0.64705882 0.82352941 0.64705882] mean value: 0.7235294117647059 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.7826087 0.90909091 0.81818182 0.7 0.72727273 0.85714286 0.72727273 0.76923077 0.86956522 0.72727273] mean value: 0.7887638448508013 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.69230769 0.83333333 0.75 0.7 0.66666667 0.9 0.72727273 0.66666667 0.83333333 0.72727273] mean value: 0.7496853146853146 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 0.9 0.7 0.8 0.81818182 0.72727273 0.90909091 0.90909091 0.72727273] mean value: 0.8390909090909091 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.66428571 0.85714286 0.73571429 0.63571429 0.61428571 0.82575758 0.61363636 0.53787879 0.78787879 0.61363636] mean value: 0.6885930735930736 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.64285714 0.83333333 0.69230769 0.53846154 0.57142857 0.75 0.57142857 0.625 0.76923077 0.57142857] mean value: 0.656547619047619 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00948 0.01067305 0.00918198 0.00951219 0.0090723 0.00886846 0.00886393 0.00893688 0.00908589 0.01140523] mean value: 0.009507989883422852 key: score_time value: [0.00905252 0.00954819 0.00903249 0.00884628 0.00869274 0.00862312 0.00868821 0.00864053 0.00882077 0.00961161] mean value: 0.008955645561218261 key: test_mcc value: [ 0.11769798 -0.27774603 0.27142857 0.13241022 0.38122129 0.22727273 0.33371191 0.38251843 0.29012943 0.22727273] mean value: 0.20859172445585672 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.52941176 0.41176471 0.64705882 0.58823529 0.70588235 0.64705882 0.70588235 0.70588235 0.64705882 0.64705882] mean value: 0.6235294117647059 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.5 0.54545455 0.7 0.66666667 0.7826087 0.72727273 0.7826087 0.76190476 0.7 0.72727273] mean value: 0.6893788819875777 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.66666667 0.5 0.7 0.63636364 0.69230769 0.72727273 0.75 0.8 0.77777778 0.72727273] mean value: 0.6977661227661227 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.4 0.6 0.7 0.7 0.9 0.72727273 0.81818182 0.72727273 0.63636364 0.72727273] mean value: 0.6936363636363636 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.55714286 0.37142857 0.63571429 0.56428571 0.66428571 0.61363636 0.65909091 0.6969697 0.65151515 0.61363636] mean value: 0.6027705627705628 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.33333333 0.375 0.53846154 0.5 0.64285714 0.57142857 0.64285714 0.61538462 0.53846154 0.57142857] mean value: 0.5329212454212454 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.27949595 1.2522831 1.23157477 1.28226137 1.31405354 1.33238959 1.32696438 1.29168916 1.21863532 1.26388741] mean value: 1.2793234586715698 key: score_time value: [0.10485697 0.09505439 0.0964644 0.17082667 0.10032129 0.09584975 0.09731865 0.12909579 0.12827754 0.13281274] mean value: 0.11508781909942627 key: test_mcc value: [0.66299354 0.66299354 0.63262663 0.50920105 0.77151675 0.63262663 0.87400737 0.4608824 0.60385964 0.60385964] mean value: 0.6414567202607909 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.82352941 0.82352941 0.82352941 0.76470588 0.88235294 0.82352941 0.94117647 0.76470588 0.82352941 0.82352941] mean value: 0.8294117647058823 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.86956522 0.86956522 0.85714286 0.81818182 0.90909091 0.85714286 0.95652174 0.83333333 0.86956522 0.86956522] mean value: 0.8709674383587427 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.76923077 0.76923077 0.81818182 0.75 0.83333333 0.9 0.91666667 0.76923077 0.83333333 0.83333333] mean value: 0.8192540792540792 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.9 0.9 1. 0.81818182 1. 0.90909091 0.90909091 0.90909091] mean value: 0.9345454545454546 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.78571429 0.78571429 0.80714286 0.73571429 0.85714286 0.82575758 0.91666667 0.70454545 0.78787879 0.78787879] mean value: 0.7994155844155845 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.76923077 0.76923077 0.75 0.69230769 0.83333333 0.75 0.91666667 0.71428571 0.76923077 0.76923077] mean value: 0.7733516483516484 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, 
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. 
warn( key: fit_time value: [1.94012594 0.93395567 0.88578105 0.94665599 1.04666948 1.01622057 1.05611706 1.00596952 1.00941062 1.69974065] mean value: 1.154064655303955 key: score_time value: [0.16018629 0.12835646 0.13689685 0.14965868 0.14444923 0.15999627 0.15356684 0.15862679 0.17733407 0.12240601] mean value: 0.14914774894714355 key: test_mcc value: [0.55328334 0.66299354 0.77151675 0.50920105 0.66299354 0.63262663 0.4608824 0.4608824 0.87400737 0.62678317] mean value: 0.6215170201634855 key: train_mcc value: [0.87638923 0.88986734 0.88986734 0.93172069 0.88986734 0.8640452 0.87733952 0.89069566 0.89069566 0.87733952] mean value: 0.8877827514166594 key: test_accuracy value: [0.76470588 0.82352941 0.88235294 0.76470588 0.82352941 0.82352941 0.76470588 0.76470588 0.94117647 0.82352941] mean value: 0.8176470588235294 key: train_accuracy value: [0.94117647 0.94771242 0.94771242 0.96732026 0.94771242 0.93464052 0.94117647 0.94771242 0.94771242 0.94117647] mean value: 0.9464052287581699 key: test_fscore value: [0.83333333 0.86956522 0.90909091 0.81818182 0.86956522 0.85714286 0.83333333 0.83333333 0.95652174 0.88 ] mean value: 0.8660067758328628 key: train_fscore value: [0.95431472 0.95918367 0.95918367 0.97435897 0.95918367 0.94897959 0.95384615 0.95876289 0.95876289 0.95384615] mean value: 0.9580422388304239 key: test_precision value: [0.71428571 0.76923077 0.83333333 0.75 0.76923077 0.9 0.76923077 0.76923077 0.91666667 0.78571429] mean value: 0.7976923076923077 key: train_precision value: [0.92156863 0.93069307 0.93069307 0.95 0.93069307 0.91176471 0.92079208 0.93 0.93 0.92079208] mean value: 0.9276996699669967 key: test_recall value: [1. 1. 1. 0.9 1. 0.81818182 0.90909091 0.90909091 1. 1. ] mean value: 0.9536363636363636 key: train_recall value: [0.98947368 0.98947368 0.98947368 1. 
0.98947368 0.9893617 0.9893617 0.9893617 0.9893617 0.9893617 ] mean value: 0.9904703247480403 key: test_roc_auc value: [0.71428571 0.78571429 0.85714286 0.73571429 0.78571429 0.82575758 0.70454545 0.70454545 0.91666667 0.75 ] mean value: 0.778008658008658 key: train_roc_auc value: [0.92577132 0.93439201 0.93439201 0.95689655 0.93439201 0.91840966 0.92688424 0.93535882 0.93535882 0.92688424] mean value: 0.9328739700888068 key: test_jcc value: [0.71428571 0.76923077 0.83333333 0.69230769 0.76923077 0.75 0.71428571 0.71428571 0.91666667 0.78571429] mean value: 0.765934065934066 key: train_jcc value: [0.91262136 0.92156863 0.92156863 0.95 0.92156863 0.90291262 0.91176471 0.92079208 0.92079208 0.91176471] mean value: 0.9195353433116012 MCC on Blind test: 0.74 Accuracy on Blind test: 0.87 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, 
num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01290655 0.01201439 0.0091145 0.00908995 0.00977802 0.00959659 0.00912237 0.0090313 0.00955582 0.00962853] mean value: 0.00998380184173584 key: score_time value: [0.01208782 0.00901628 0.00966477 0.0090487 0.00951076 0.00898027 0.00883889 0.00868464 0.00920987 0.00892973] mean value: 0.009397172927856445 key: test_mcc value: [ 0.23975611 0.38251843 0.63262663 0.02857143 0.50920105 0.2030906 -0.13241022 -0.28787879 0.13241022 0.17069719] mean value: 0.18785826455207946 key: train_mcc value: [0.42092813 0.39333516 0.32656704 0.46856319 0.39056476 0.41094842 0.47583844 0.38542713 0.43627743 0.37735366] mean value: 0.40858033755597606 key: test_accuracy value: [0.64705882 0.70588235 0.82352941 0.52941176 0.76470588 0.58823529 0.41176471 0.41176471 0.58823529 0.64705882] mean value: 0.611764705882353 key: train_accuracy value: [0.7254902 0.7124183 0.69281046 0.74509804 0.71895425 0.7254902 0.75163399 0.70588235 0.73202614 0.70588235] mean value: 0.7215686274509804 key: test_fscore value: [0.75 0.76190476 0.85714286 0.6 0.81818182 0.63157895 0.44444444 0.54545455 0.66666667 0.75 ] mean value: 0.6825374041163514 key: train_fscore value: [0.77659574 0.76595745 0.76616915 0.78918919 0.78172589 0.78350515 0.79787234 0.75675676 0.78074866 0.76190476] mean value: 0.776042510006011 key: test_precision value: [0.64285714 0.72727273 0.81818182 0.6 0.75 0.75 0.57142857 0.54545455 0.7 0.69230769] mean value: 0.6797502497502498 key: train_precision value: [0.78494624 0.77419355 0.72641509 0.81111111 0.75490196 0.76 0.79787234 0.76923077 0.78494624 0.75789474] mean value: 
0.772151203423883 key: test_recall value: [0.9 0.8 0.9 0.6 0.9 0.54545455 0.36363636 0.54545455 0.63636364 0.81818182] mean value: 0.7009090909090909 key: train_recall value: [0.76842105 0.75789474 0.81052632 0.76842105 0.81052632 0.80851064 0.79787234 0.74468085 0.77659574 0.76595745] mean value: 0.7809406494960806 key: test_roc_auc value: [0.59285714 0.68571429 0.80714286 0.51428571 0.73571429 0.60606061 0.43181818 0.35606061 0.56818182 0.57575758] mean value: 0.5873593073593074 key: train_roc_auc value: [0.71179673 0.69791289 0.65526316 0.7376588 0.68974592 0.70086549 0.73791922 0.69437432 0.71880635 0.68806347] mean value: 0.7032406345084143 key: test_jcc value: [0.6 0.61538462 0.75 0.42857143 0.69230769 0.46153846 0.28571429 0.375 0.5 0.6 ] mean value: 0.5308516483516483 key: train_jcc value: [0.63478261 0.62068966 0.62096774 0.65178571 0.64166667 0.6440678 0.66371681 0.60869565 0.64035088 0.61538462] mean value: 0.6342108142276903 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [0.95170927 1.08157611 4.2941637 1.35887098 1.44883847 1.49672079 1.49727535 1.45497322 4.00354838 5.43425512] mean value: 2.3021931409835816 key: score_time value: [0.01140547 0.05233741 0.01315928 0.01244068 0.0132606 0.01195669 0.01259422 0.01315713 0.02558994 0.01518345] mean value: 0.018108487129211426 key: test_mcc value: [0.38122129 0.66299354 0.88741197 0.63262663 0.75714286 0.87400737 1. 0.78334945 0.87400737 0.88273483] mean value: 0.7735495312682757 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.70588235 0.82352941 0.94117647 0.82352941 0.88235294 0.94117647 1. 0.88235294 0.94117647 0.94117647] mean value: 0.888235294117647 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.7826087 0.86956522 0.94736842 0.85714286 0.9 0.95652174 1. 0.9 0.95652174 0.95238095] mean value: 0.912210962188079 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.69230769 0.76923077 1. 0.81818182 0.9 0.91666667 1. 1. 0.91666667 1. ] mean value: 0.9013053613053613 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 0.9 0.9 0.9 1. 1. 0.81818182 1. 0.90909091] mean value: 0.9327272727272727 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.66428571 0.78571429 0.95 0.80714286 0.87857143 0.91666667 1. 
0.90909091 0.91666667 0.95454545] mean value: 0.8782683982683983 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.64285714 0.76923077 0.9 0.75 0.81818182 0.91666667 1. 0.81818182 0.91666667 0.90909091] mean value: 0.8440875790875791 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, 
random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04736233 0.06068015 0.0650456 0.06523633 0.05861163 0.05586934 0.0611794 0.06430578 0.06066442 0.05646658] mean value: 0.059542155265808104 key: score_time value: [0.0297966 0.0236876 0.01204276 0.02082038 0.02062225 0.01970601 0.02349544 0.0230937 0.02011728 0.02406526] mean value: 0.021744728088378906 key: test_mcc value: [0.51428571 0.63262663 0.30988989 0.50920105 0.07042952 0.74242424 0.53673944 0.4608824 0.2030906 0.78334945] mean value: 0.4762918944486056 key: train_mcc value: [0.95830113 0.94445829 0.98616507 0.95830113 1. 0.94483888 0.98625704 0.98625704 1. 
0.97261224] mean value: 0.9737190818635186 key: test_accuracy value: [0.76470588 0.82352941 0.64705882 0.76470588 0.52941176 0.88235294 0.76470588 0.76470588 0.58823529 0.88235294] mean value: 0.7411764705882353 key: train_accuracy value: [0.98039216 0.97385621 0.99346405 0.98039216 1. 0.97385621 0.99346405 0.99346405 1. 0.9869281 ] mean value: 0.9875816993464053 key: test_fscore value: [0.8 0.85714286 0.66666667 0.81818182 0.55555556 0.90909091 0.8 0.83333333 0.63157895 0.9 ] mean value: 0.777155008733956 key: train_fscore value: [0.98429319 0.97916667 0.9947644 0.98429319 1. 0.97894737 0.99470899 0.99470899 1. 0.98947368] mean value: 0.9900356494056549 key: test_precision value: [0.8 0.81818182 0.75 0.75 0.625 0.90909091 0.88888889 0.76923077 0.75 1. ] mean value: 0.8060392385392385 key: train_precision value: [0.97916667 0.96907216 0.98958333 0.97916667 1. 0.96875 0.98947368 0.98947368 1. 0.97916667] mean value: 0.9843852866702839 key: test_recall value: [0.8 0.9 0.6 0.9 0.5 0.90909091 0.72727273 0.90909091 0.54545455 0.81818182] mean value: 0.7609090909090909 key: train_recall value: [0.98947368 0.98947368 1. 0.98947368 1. 0.9893617 1. 1. 1. 1. ] mean value: 0.9957782754759239 key: test_roc_auc value: [0.75714286 0.80714286 0.65714286 0.73571429 0.53571429 0.87121212 0.78030303 0.70454545 0.60606061 0.90909091] mean value: 0.7364069264069264 key: train_roc_auc value: [0.97749546 0.96887477 0.99137931 0.97749546 1. 0.96925712 0.99152542 0.99152542 1. 0.98305085] mean value: 0.9850603826239935 key: test_jcc value: [0.66666667 0.75 0.5 0.69230769 0.38461538 0.83333333 0.66666667 0.71428571 0.46153846 0.81818182] mean value: 0.6487595737595737 key: train_jcc value: [0.96907216 0.95918367 0.98958333 0.96907216 1. 0.95876289 0.98947368 0.98947368 1. 
0.97916667] mean value: 0.9803788258385285 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02010775 0.00967646 0.00920439 0.00903702 0.00907183 0.00884581 0.00869846 0.0089879 0.0096581 0.00971985] mean value: 0.010300755500793457 key: score_time value: [0.01351142 0.0090692 0.00885296 0.00862312 0.00895929 0.00884748 0.00848365 0.00872898 0.00927663 0.00903773] mean value: 0.009339046478271485 key: test_mcc value: [ 0.55328334 0.50920105 0.66299354 0.02857143 0.24688536 0.22727273 0.04351941 -0.11948803 0.33371191 0.49441323] mean value: 0.29803639728108056 key: train_mcc value: [0.42621329 0.33692443 0.38059794 0.36668738 0.39480728 0.43071005 0.43299259 0.3747783 0.37096514 0.38834821] mean value: 0.3903024610468177 key: test_accuracy value: [0.76470588 0.76470588 0.82352941 0.52941176 0.64705882 0.64705882 0.52941176 0.52941176 0.70588235 0.76470588] mean value: 0.6705882352941176 key: train_accuracy value: [0.73856209 0.69934641 0.71895425 0.7124183 0.7254902 0.73856209 0.73856209 0.7124183 0.7124183 0.71895425] mean value: 0.7215686274509804 key: test_fscore value: [0.83333333 0.81818182 0.86956522 0.6 0.72727273 0.72727273 0.6 0.66666667 0.7826087 0.84615385] mean value: 0.7471055031924597 key: train_fscore value: 
[0.80392157 0.7745098 0.7902439 0.78431373 0.7961165 0.80392157 0.8 0.78 0.78431373 0.78606965] mean value: 0.790341045119155 key: test_precision value: [0.71428571 0.75 0.76923077 0.6 0.66666667 0.72727273 0.66666667 0.61538462 0.75 0.73333333] mean value: 0.6992840492840493 key: train_precision value: [0.75229358 0.72477064 0.73636364 0.73394495 0.73873874 0.74545455 0.75471698 0.73584906 0.72727273 0.73831776] mean value: 0.7387722616886769 key: test_recall value: [1. 0.9 1. 0.6 0.8 0.72727273 0.54545455 0.72727273 0.81818182 1. ] mean value: 0.8118181818181818 key: train_recall value: [0.86315789 0.83157895 0.85263158 0.84210526 0.86315789 0.87234043 0.85106383 0.82978723 0.85106383 0.84042553] mean value: 0.8497312430011198 key: test_roc_auc value: [0.71428571 0.73571429 0.78571429 0.51428571 0.61428571 0.61363636 0.52272727 0.4469697 0.65909091 0.66666667] mean value: 0.6273376623376623 key: train_roc_auc value: [0.69882033 0.65716878 0.67631579 0.67105263 0.68157895 0.69888208 0.70519293 0.67760548 0.67129463 0.68292463] mean value: 0.682083622669467 key: test_jcc value: [0.71428571 0.69230769 0.76923077 0.42857143 0.57142857 0.57142857 0.42857143 0.5 0.64285714 0.73333333] mean value: 0.6052014652014652 key: train_jcc value: [0.67213115 0.632 0.65322581 0.64516129 0.66129032 0.67213115 0.66666667 0.63934426 0.64516129 0.64754098] mean value: 0.6534652917327692 MCC on Blind test: -0.07 Accuracy on Blind test: 0.53 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', 
ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01129508 0.01568604 0.01332641 0.03448272 0.01468325 0.01636481 0.03756166 0.05245757 0.02793503 0.01410508] mean value: 0.02378976345062256 key: score_time value: [0.00920486 0.01116633 0.01118207 0.02200365 0.01170421 0.01170301 0.0180769 0.02105355 0.01168871 0.01157594] mean value: 0.01393592357635498 key: test_mcc value: [0.51428571 0.55328334 0.63262663 0.38122129 0.29880715 0.88273483 0.49441323 0.33371191 0.47673129 0.30389487] mean value: 0.4871710250874902 key: train_mcc value: [0.90340823 0.86531409 0.86241574 0.83628052 0.65815792 0.95883964 0.51726562 0.87413232 0.61903367 0.73827438] mean value: 0.783312212337401 key: test_accuracy value: [0.76470588 0.76470588 0.82352941 0.70588235 0.64705882 0.94117647 0.76470588 0.70588235 0.64705882 0.70588235] mean value: 0.7470588235294118 key: train_accuracy value: [0.95424837 0.93464052 0.93464052 0.92156863 0.83006536 0.98039216 0.75816993 0.93464052 0.76470588 0.86928105] mean value: 0.888235294117647 key: test_fscore value: [0.8 0.83333333 0.85714286 0.7826087 0.76923077 0.95238095 0.84615385 0.7826087 0.625 0.8 ] mean value: 0.8048459149546106 key: train_fscore value: [0.96410256 0.95 0.94680851 0.94 0.87962963 0.98395722 0.83555556 0.94382022 0.76315789 0.90384615] mean value: 0.9110877752479482 key: test_precision value: [0.8 0.71428571 0.81818182 0.69230769 0.625 1. 0.73333333 0.75 1. 0.71428571] mean value: 0.7847394272394272 key: train_precision value: [0.94 0.9047619 0.95698925 0.8952381 0.78512397 0.98924731 0.71755725 1. 1. 
0.8245614 ] mean value: 0.9013479181499102 key: test_recall value: [0.8 1. 0.9 0.9 1. 0.90909091 1. 0.81818182 0.45454545 0.90909091] mean value: 0.8690909090909091 key: train_recall value: [0.98947368 1. 0.93684211 0.98947368 1. 0.9787234 1. 0.89361702 0.61702128 1. ] mean value: 0.940515117581187 key: test_roc_auc value: [0.75714286 0.71428571 0.80714286 0.66428571 0.57142857 0.95454545 0.66666667 0.65909091 0.72727273 0.62121212] mean value: 0.7143073593073593 key: train_roc_auc value: [0.9430127 0.9137931 0.93393829 0.89990926 0.77586207 0.98088713 0.68644068 0.94680851 0.80851064 0.83050847] mean value: 0.8719670853832294 key: test_jcc value: [0.66666667 0.71428571 0.75 0.64285714 0.625 0.90909091 0.73333333 0.64285714 0.45454545 0.66666667] mean value: 0.680530303030303 key: train_jcc value: [0.93069307 0.9047619 0.8989899 0.88679245 0.78512397 0.96842105 0.71755725 0.89361702 0.61702128 0.8245614 ] mean value: 0.842753929875216 MCC on Blind test: 0.6 Accuracy on Blind test: 0.8 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01497602 0.02522302 0.03343582 0.03448534 0.02283978 0.03451777 0.03477836 0.03254557 0.01978588 0.01348734] mean value: 0.026607489585876463 key: score_time value: [0.01177239 0.01971555 0.02048707 0.01987171 0.02130342 0.01973534 0.03330612 0.02216315 0.02278042 0.01166964] mean value: 0.02028048038482666 key: test_mcc value: [0.36780618 0.66299354 0.36780618 0.38122129 0.06546537 0.88273483 0.26967994 0.2030906 0.4608824 0.3385016 ] mean value: 0.4000181931146463 key: train_mcc value: [0.68055705 0.55202478 0.80732775 0.84960093 0.48191696 0.83778301 0.74694017 0.71803726 0.78917952 0.36822985] mean value: 0.6831597278235293 key: test_accuracy value: [0.64705882 0.82352941 0.64705882 0.70588235 0.58823529 0.94117647 0.47058824 0.58823529 0.76470588 0.70588235] mean value: 0.6882352941176471 key: train_accuracy value: [0.81045752 0.77777778 0.89542484 0.92810458 0.74509804 0.92156863 0.85620915 0.83660131 0.89542484 0.69281046] mean value: 0.8359477124183007 key: test_fscore value: [0.625 0.86956522 0.625 0.7826087 0.72 0.95238095 0.30769231 0.63157895 0.83333333 0.81481481] mean value: 0.7161974268633308 key: train_fscore value: [0.81987578 0.84821429 0.90804598 0.94472362 0.82969432 0.93478261 0.86746988 0.84662577 0.92156863 0.8 ] mean value: 0.8721000862893723 key: test_precision value: [0.83333333 0.76923077 0.83333333 0.69230769 0.6 1. 1. 0.75 0.76923077 0.6875 ] mean value: 0.7934935897435897 key: train_precision value: [1. 0.73643411 1. 0.90384615 0.70895522 0.95555556 1. 1. 
0.85454545 0.66666667] mean value: 0.882600316302156 key: test_recall value: [0.5 1. 0.5 0.9 0.9 0.90909091 0.18181818 0.54545455 0.90909091 1. ] mean value: 0.7345454545454545 key: train_recall value: [0.69473684 1. 0.83157895 0.98947368 1. 0.91489362 0.76595745 0.73404255 1. 1. ] mean value: 0.8930683090705487 key: test_roc_auc value: [0.67857143 0.78571429 0.67857143 0.66428571 0.52142857 0.95454545 0.59090909 0.60606061 0.70454545 0.58333333] mean value: 0.6767965367965368 key: train_roc_auc value: [0.84736842 0.70689655 0.91578947 0.90852995 0.6637931 0.9235485 0.88297872 0.86702128 0.86440678 0.60169492] mean value: 0.8182027693803942 key: test_jcc value: [0.45454545 0.76923077 0.45454545 0.64285714 0.5625 0.90909091 0.18181818 0.46153846 0.71428571 0.6875 ] mean value: 0.5837912087912088 key: train_jcc value: [0.69473684 0.73643411 0.83157895 0.8952381 0.70895522 0.87755102 0.76595745 0.73404255 0.85454545 0.66666667] mean value: 0.7765706358739792 MCC on Blind test: 0.22 Accuracy on Blind test: 0.47 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.13799787 0.19559717 0.13715434 0.13844132 0.13842058 0.20680475 0.13890123 0.13737702 0.13736558 0.13849258] mean value: 0.15065524578094483 key: score_time value: [0.02109694 0.02060199 0.02058935 0.02053523 0.02055168 0.02046013 0.02037907 0.02750397 0.02053142 0.02038026] mean value: 0.0212630033493042 key: test_mcc value: [0.50920105 0.66299354 0.7 0.63262663 0.54935027 0.53673944 1. 0.63262663 0.60385964 1. ] mean value: 0.6827397199255986 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.76470588 0.82352941 0.82352941 0.82352941 0.76470588 0.76470588 1. 0.82352941 0.82352941 1. ] mean value: 0.8411764705882353 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.81818182 0.86956522 0.82352941 0.85714286 0.77777778 0.8 1. 0.85714286 0.86956522 1. ] mean value: 0.8672905156792625 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 0.76923077 1. 0.81818182 0.875 0.88888889 1. 0.9 0.83333333 1. ] mean value: 0.8834634809634809 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 0.7 0.9 0.7 0.72727273 1. 0.81818182 0.90909091 1. ] mean value: 0.8654545454545455 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.73571429 0.78571429 0.85 0.80714286 0.77857143 0.78030303 1. 0.82575758 0.78787879 1. ] mean value: 0.8351082251082251 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.69230769 0.76923077 0.7 0.75 0.63636364 0.66666667 1. 0.75 0.76923077 1. ] mean value: 0.7733799533799534 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), 
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.08164883 0.06786728 0.0824008 0.08094072 0.07851148 0.07243514 0.07851815 0.07457829 0.06393242 0.07801318] mean value: 0.07588462829589844 key: score_time value: [0.02787757 0.02429485 0.02658248 0.0281651 0.02558804 0.02345562 0.02348995 0.02183986 0.02336168 0.02322936] mean value: 0.024788451194763184 key: test_mcc value: [0.66299354 0.77151675 0.78881064 0.63262663 0.75714286 0.87400737 1. 0.78334945 0.74242424 0.88273483] mean value: 0.7895606313848224 key: train_mcc value: [0.98625704 1. 0.95857961 1. 0.9722323 1. 0.97241255 0.98625704 0.95857961 0.98625704] mean value: 0.9820575189370783 key: test_accuracy value: [0.82352941 0.88235294 0.88235294 0.82352941 0.88235294 0.94117647 1. 0.88235294 0.88235294 0.94117647] mean value: 0.8941176470588235 key: train_accuracy value: [0.99346405 1. 0.98039216 1. 0.9869281 1. 
0.9869281 0.99346405 0.98039216 0.99346405] mean value: 0.9915032679738562 key: test_fscore value: [0.86956522 0.90909091 0.88888889 0.85714286 0.9 0.95652174 1. 0.9 0.90909091 0.95238095] mean value: 0.9142681473116256 key: train_fscore value: [0.99470899 1. 0.98412698 1. 0.98947368 1. 0.9893617 0.99470899 0.98412698 0.99470899] mean value: 0.9931216338719138 key: test_precision value: [0.76923077 0.83333333 1. 0.81818182 0.9 0.91666667 1. 1. 0.90909091 1. ] mean value: 0.9146503496503496 key: train_precision value: [1. 1. 0.9893617 1. 0.98947368 1. 0.9893617 0.98947368 0.97894737 0.98947368] mean value: 0.9926091825307951 key: test_recall value: [1. 1. 0.8 0.9 0.9 1. 1. 0.81818182 0.90909091 0.90909091] mean value: 0.9236363636363636 key: train_recall value: [0.98947368 1. 0.97894737 1. 0.98947368 1. 0.9893617 1. 0.9893617 1. ] mean value: 0.9936618141097424 key: test_roc_auc value: [0.78571429 0.85714286 0.9 0.80714286 0.87857143 0.91666667 1. 0.90909091 0.87121212 0.95454545] mean value: 0.888008658008658 key: train_roc_auc value: [0.99473684 1. 0.98085299 1. 0.98611615 1. 0.98620627 0.99152542 0.9777317 0.99152542] mean value: 0.9908694809882435 key: test_jcc value: [0.76923077 0.83333333 0.8 0.75 0.81818182 0.91666667 1. 0.81818182 0.83333333 0.90909091] mean value: 0.8448018648018648 key: train_jcc value: [0.98947368 1. 0.96875 1. 0.97916667 1. 
0.97894737 0.98947368 0.96875 0.98947368] mean value: 0.9864035087719298 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient 
Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.08835268 0.09343934 0.08267331 0.09665108 0.07296014 0.09550357 0.10921788 0.09168172 0.07537723 0.08310604] mean value: 0.0888962984085083 key: score_time value: [0.03113747 0.02139211 0.03570366 0.02917075 0.02335262 0.03179145 0.03920627 0.03474498 0.03395534 0.02556109] mean value: 0.03060157299041748 key: test_mcc value: [ 0.13241022 0.38251843 0.23975611 -0.27774603 -0.18232322 0.33371191 0.04351941 0.11236664 0.4608824 0.33371191] mean value: 0.1578807778938489 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.58823529 0.70588235 0.64705882 0.41176471 0.47058824 0.70588235 0.52941176 0.64705882 0.76470588 0.70588235] mean value: 0.6176470588235294 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.76190476 0.75 0.54545455 0.60869565 0.7826087 0.6 0.76923077 0.83333333 0.7826087 ] mean value: 0.7100503120068337 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.63636364 0.72727273 0.64285714 0.5 0.53846154 0.75 0.66666667 0.66666667 0.76923077 0.75 ] mean value: 0.6647519147519148 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.7 0.8 0.9 0.6 0.7 0.81818182 0.54545455 0.90909091 0.90909091 0.81818182] mean value: 0.77 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.56428571 0.68571429 0.59285714 0.37142857 0.42142857 0.65909091 0.52272727 0.53787879 0.70454545 0.65909091] mean value: 0.5719047619047619 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.61538462 0.6 0.375 0.4375 0.64285714 0.42857143 0.625 0.71428571 0.64285714] mean value: 0.5581456043956045 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.0 Accuracy on Blind test: 0.53 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.53434014 0.47350907 0.3815167 0.38136268 0.38167262 0.37864709 0.38625383 0.3857789 0.41710877 0.39069295] mean value: 0.4110882759094238 key: score_time value: [0.0129323 0.01272464 0.01275206 0.01263762 0.01286578 0.01278806 0.0126791 0.01264119 0.01395893 0.01256704] mean value: 0.012854671478271485 key: test_mcc value: [0.77151675 0.77151675 0.88741197 0.63262663 0.75714286 0.87400737 1. 0.78334945 0.87400737 1. ] mean value: 0.8351579150791348 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.88235294 0.88235294 0.94117647 0.82352941 0.88235294 0.94117647 1. 0.88235294 0.94117647 1. ] mean value: 0.9176470588235294 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.90909091 0.90909091 0.94736842 0.85714286 0.9 0.95652174 1. 0.9 0.95652174 1. ] mean value: 0.9335736574638177 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.83333333 0.83333333 1. 0.81818182 0.9 0.91666667 1. 1. 0.91666667 1. ] mean value: 0.9218181818181819 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.9 0.9 0.9 1. 1. 0.81818182 1. 1. ] mean value: 0.9518181818181818 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.85714286 0.85714286 0.95 0.80714286 0.87857143 0.91666667 1. 0.90909091 0.91666667 1. ] mean value: 0.9092424242424243 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.83333333 0.83333333 0.9 0.75 0.81818182 0.91666667 1. 0.81818182 0.91666667 1. ] mean value: 0.8786363636363637 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") 
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.02650714 0.05811 0.05595326 0.04492354 0.07738733 0.04849219 0.03797841 0.04312611 0.06586456 0.07221007] mean value: 0.05305526256561279 key: score_time value: [0.01903915 0.03109527 0.02601171 0.02103806 0.02057695 0.01910973 0.02088594 0.0189662 0.02482319 0.02623796] mean value: 0.022778415679931642 key: test_mcc value: [ 0.13241022 0.38122129 -0.30550505 0.23975611 0.38122129 0.30389487 0.3385016 -0.01899343 0.06356417 0.30389487] mean value: 0.18199659502726337 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.58823529 0.70588235 0.47058824 0.64705882 0.70588235 0.70588235 0.70588235 0.58823529 0.58823529 0.70588235] mean value: 0.6411764705882353 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_fscore value: [0.66666667 0.7826087 0.64 0.75 0.7826087 0.8 0.81481481 0.72 0.69565217 0.8 ] mean value: 0.7452351046698873 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.63636364 0.69230769 0.53333333 0.64285714 0.69230769 0.71428571 0.6875 0.64285714 0.66666667 0.71428571] mean value: 0.6622764735264736 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.7 0.9 0.8 0.9 0.9 0.90909091 1. 0.81818182 0.72727273 0.90909091] mean value: 0.8563636363636363 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.56428571 0.66428571 0.4 0.59285714 0.66428571 0.62121212 0.58333333 0.49242424 0.53030303 0.62121212] mean value: 0.5734199134199134 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.64285714 0.47058824 0.6 0.64285714 0.66666667 0.6875 0.5625 0.53333333 0.66666667] mean value: 0.597296918767507 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.27 Accuracy on Blind test: 0.67 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.08099842 0.08981967 0.04990792 0.05527115 0.05714655 0.05031085 0.05187988 0.05764365 0.05021954 0.05180001] mean value: 0.05949976444244385 key: score_time value: [0.0333674 0.02936316 0.02458978 0.03190279 0.02809477 0.03452396 0.02786756 0.02454042 0.02761149 0.02961063] mean value: 0.02914719581604004 key: test_mcc value: [0.51428571 0.66299354 0.27142857 0.63262663 0.63262663 0.88273483 0.38251843 0.33371191 0.74242424 0.33371191] mean value: 0.5389062395989187 key: train_mcc value: [0.9306986 0.9587737 0.91649194 0.90340823 0.93172069 0.90411865 0.94559731 0.95906064 0.91761348 0.90330977] mean value: 0.9270793013098675 key: test_accuracy value: [0.76470588 0.82352941 0.64705882 0.82352941 0.82352941 0.94117647 0.70588235 0.70588235 0.88235294 0.70588235] mean value: 0.7823529411764706 key: train_accuracy value: [0.96732026 0.98039216 0.96078431 0.95424837 0.96732026 0.95424837 0.97385621 0.98039216 0.96078431 0.95424837] mean value: 0.965359477124183 key: test_fscore value: [0.8 0.86956522 0.7 0.85714286 0.85714286 0.95238095 0.76190476 0.7826087 0.90909091 0.7826087 ] mean value: 0.827244494635799 key: train_fscore 
value: [0.97409326 0.98445596 0.96875 0.96410256 0.97435897 0.96373057 0.97916667 0.98429319 0.96875 0.96335079] mean value: 0.9725051976931911 key: test_precision value: [0.8 0.76923077 0.7 0.81818182 0.81818182 1. 0.8 0.75 0.90909091 0.75 ] mean value: 0.8114685314685315 key: train_precision value: [0.95918367 0.96938776 0.95876289 0.94 0.95 0.93939394 0.95918367 0.96907216 0.94897959 0.94845361] mean value: 0.9542417293065305 key: test_recall value: [0.8 1. 0.7 0.9 0.9 0.90909091 0.72727273 0.81818182 0.90909091 0.81818182] mean value: 0.8481818181818181 key: train_recall value: [0.98947368 1. 0.97894737 0.98947368 1. 0.9893617 1. 1. 0.9893617 0.9787234 ] mean value: 0.9915341545352744 key: test_roc_auc value: [0.75714286 0.78571429 0.63571429 0.80714286 0.80714286 0.95454545 0.6969697 0.65909091 0.87121212 0.65909091] mean value: 0.7633766233766234 key: train_roc_auc value: [0.96025408 0.97413793 0.95499093 0.9430127 0.95689655 0.94383339 0.96610169 0.97457627 0.95230797 0.94698882] mean value: 0.9573100346025291 key: test_jcc value: [0.66666667 0.76923077 0.53846154 0.75 0.75 0.90909091 0.61538462 0.64285714 0.83333333 0.64285714] mean value: 0.7117882117882118 key: train_jcc value: [0.94949495 0.96938776 0.93939394 0.93069307 0.95 0.93 0.95918367 0.96907216 0.93939394 0.92929293] mean value: 0.946591242040257 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', 
RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.411973 0.34421086 0.39803886 0.35081291 0.3494761 0.33145618 0.3288908 0.35393381 0.34595108 0.33966637] mean value: 0.3554409980773926 key: score_time value: [0.03101277 0.03190875 0.02901888 0.02524614 0.03131557 0.02620721 0.0313201 0.02892303 0.03440547 0.02825022] mean value: 0.02976081371307373 key: test_mcc value: [0.51428571 0.66299354 0.27142857 0.63262663 0.63262663 0.88273483 0.38251843 0.33371191 0.63262663 0.33371191] mean value: 0.5279264781376682 key: train_mcc value: [0.9306986 0.9587737 0.91649194 0.90340823 0.94445829 0.90411865 0.94559731 0.95906064 0.93118521 0.90330977] mean value: 0.929710233179641 key: test_accuracy value: [0.76470588 0.82352941 0.64705882 0.82352941 0.82352941 0.94117647 0.70588235 0.70588235 0.82352941 0.70588235] mean value: 0.7764705882352941 key: train_accuracy value: [0.96732026 0.98039216 0.96078431 0.95424837 0.97385621 0.95424837 0.97385621 0.98039216 0.96732026 0.95424837] mean value: 0.9666666666666667 key: test_fscore value: [0.8 0.86956522 0.7 0.85714286 0.85714286 0.95238095 0.76190476 0.7826087 0.85714286 0.7826087 ] mean value: 0.8220496894409938
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:107: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:110: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: train_fscore value: [0.97409326 0.98445596 0.96875 0.96410256 0.97916667 0.96373057 0.97916667 0.98429319 0.97382199 0.96335079] mean value: 0.9734931658768399 key: test_precision value: [0.8 0.76923077 0.7 0.81818182 0.81818182 1. 0.8 0.75 0.9 0.75 ] mean value: 0.8105594405594406 key: train_precision value: [0.95918367 0.96938776 0.95876289 0.94 0.96907216 0.93939394 0.95918367 0.96907216 0.95876289 0.94845361] mean value: 0.9571272752774962 key: test_recall value: [0.8 1. 0.7 0.9 0.9 0.90909091 0.72727273 0.81818182 0.81818182 0.81818182] mean value: 0.8390909090909091 key: train_recall value: [0.98947368 1. 0.97894737 0.98947368 0.98947368 0.9893617 1. 1. 0.9893617 0.9787234 ] mean value: 0.990481522956327 key: test_roc_auc value: [0.75714286 0.78571429 0.63571429 0.80714286 0.80714286 0.95454545 0.6969697 0.65909091 0.82575758 0.65909091] mean value: 0.7588311688311689 key: train_roc_auc value: [0.96025408 0.97413793 0.95499093 0.9430127 0.96887477 0.94383339 0.96610169 0.97457627 0.96078255 0.94698882] mean value: 0.9593553143712085 key: test_jcc value: [0.66666667 0.76923077 0.53846154 0.75 0.75 0.90909091 0.61538462 0.64285714 0.75 0.64285714] mean value: 0.7034548784548784 key: train_jcc value: [0.94949495 0.96938776 0.93939394 0.93069307 0.95918367 0.93 0.95918367 0.96907216 0.94897959 0.92929293] mean value: 0.9484681746314754 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest 
Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 
'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.04003 0.03358912 0.06656361 0.0678463 0.10102797 0.07133985 0.07167172 0.07186866 0.08310914 0.07368183] mean value: 0.06807281970977783 key: score_time value: [0.01230335 0.0230813 0.02241945 0.02380037 0.02342224 0.01903343 0.03061485 0.0213964 0.02022529 0.01746082] mean value: 0.021375751495361327 key: test_mcc value: [0.52295779 0.71562645 0.4719399 0.71562645 0.62641448 0.82572282 0.44038551 0.52295779 0.71818182 0.42727273] mean value: 0.5987085733914107 key: train_mcc value: [0.8518477 0.80951848 0.83085028 0.79896965 0.83088812 0.84132139 0.8518477 0.83068309 0.84166312 0.89500244] mean value: 0.8382591983762472 key: test_accuracy value: [0.76190476 0.85714286 0.71428571 0.85714286 0.80952381 0.9047619 0.71428571 0.76190476 0.85714286 0.71428571] mean value: 0.7952380952380952 key: train_accuracy value: [0.92592593 0.9047619 0.91534392 0.8994709 0.91534392 0.92063492 0.92592593 0.91534392 0.92063492 0.94708995] mean value: 0.919047619047619 key: test_fscore value: [0.73684211 0.84210526 0.75 0.84210526 0.77777778 0.9 0.7 0.7826087 0.85714286 0.72727273] mean value: 0.7915854689424483 key: train_fscore value: [0.92631579 0.90526316 0.91666667 0.90052356 0.91489362 0.92063492 0.92553191 0.91489362 0.91891892 0.94791667] mean value: 0.9191558829401187 key: test_precision value: [0.77777778 0.88888889 0.64285714 0.88888889 0.875 1. 
0.77777778 0.75 0.9 0.72727273] mean value: 0.8228463203463203 key: train_precision value: [0.92631579 0.90526316 0.90721649 0.89583333 0.92473118 0.91578947 0.92553191 0.91489362 0.93406593 0.92857143] mean value: 0.917821232657928 key: test_recall value: [0.7 0.8 0.9 0.8 0.7 0.81818182 0.63636364 0.81818182 0.81818182 0.72727273] mean value: 0.7718181818181818 key: train_recall value: [0.92631579 0.90526316 0.92631579 0.90526316 0.90526316 0.92553191 0.92553191 0.91489362 0.90425532 0.96808511] mean value: 0.9206718924972005 key: test_roc_auc value: [0.75909091 0.85454545 0.72272727 0.85454545 0.80454545 0.90909091 0.71818182 0.75909091 0.85909091 0.71363636] mean value: 0.7954545454545454 key: train_roc_auc value: [0.92592385 0.90475924 0.91528555 0.89944009 0.91539754 0.92066069 0.92592385 0.91534155 0.92054871 0.94720045] mean value: 0.9190481522956326 key: test_jcc value: [0.58333333 0.72727273 0.6 0.72727273 0.63636364 0.81818182 0.53846154 0.64285714 0.75 0.57142857] mean value: 0.6595171495171496 key: train_jcc value: [0.8627451 0.82692308 0.84615385 0.81904762 0.84313725 0.85294118 0.86138614 0.84313725 0.85 0.9009901 ] mean value: 0.850646156406203 MCC on Blind test: 0.43 Accuracy on Blind test: 0.73 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [1.61443114 0.991431 1.31764698 1.33511066 1.55657887 1.8804636 2.35004854 1.59028721 1.81239128 2.00124836]
mean value: 1.644963765144348

key: score_time
value: [0.03566933 0.01466346 0.01258254 0.01850414 0.02412224 0.02417207 0.03879929 0.02366471 0.02081037 0.02138114]
mean value: 0.023436927795410158

key: test_mcc
value: [0.43007562 0.61818182 0.55161872 1. 0.80909091 0.82572282 0.55161872 0.62641448 0.90909091 0.55161872]
mean value: 0.6873432731232932

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 0.91555606 1. 1. ]
mean value: 0.9915556059051258

key: test_accuracy
value: [0.71428571 0.80952381 0.76190476 1. 0.9047619 0.9047619 0.76190476 0.80952381 0.95238095 0.76190476]
mean value: 0.8380952380952381

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 0.95767196 1. 1. ]
mean value: 0.9957671957671957

key: test_fscore
value: [0.66666667 0.8 0.7826087 1. 0.9 0.9 0.73684211 0.83333333 0.95238095 0.73684211]
mean value: 0.8308673858559442

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 0.95789474 1. 1. ]
mean value: 0.9957894736842106

key: test_precision
value: [0.75 0.8 0.69230769 1. 0.9 1. 0.875 0.76923077 1. 0.875 ]
mean value: 0.8661538461538462

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 0.94791667 1. 1. ]
mean value: 0.9947916666666666

key: test_recall
value: [0.6 0.8 0.9 1. 0.9 0.81818182 0.63636364 0.90909091 0.90909091 0.63636364]
mean value: 0.8109090909090909

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 0.96808511 1. 1. ]
mean value: 0.9968085106382979

key: test_roc_auc
value: [0.70909091 0.80909091 0.76818182 1. 0.90454545 0.90909091 0.76818182 0.80454545 0.95454545 0.76818182]
mean value: 0.8395454545454546

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 0.95772676 1. 1. ]
mean value: 0.9957726763717805

key: test_jcc
value: [0.5 0.66666667 0.64285714 1. 0.81818182 0.81818182 0.58333333 0.71428571 0.90909091 0.58333333]
mean value: 0.7235930735930736

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 0.91919192 1. 1. ]
mean value: 0.9919191919191919

MCC on Blind test: 0.58
Accuracy on Blind test: 0.8

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting',
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())])

key: fit_time
value: [0.03653574 0.01316571 0.01315188 0.0131731 0.01319766 0.01305985 0.01309562 0.01336765 0.01304221 0.01303864]
mean value: 0.015482807159423828

key: score_time
value: [0.01301122 0.01235342 0.01248431 0.01237154 0.01234913 0.01239204 0.01234293 0.01240659 0.01235342 0.01222396]
mean value: 0.012428855895996094

key: test_mcc
value: [ 0.44038551 -0.13762047 0.39196475 0.33709993 0.52727273 0.71562645 0.14545455 0.33709993 0.45226702 0.52295779]
mean value: 0.37325081711851577

key: train_mcc
value: [0.55158352 0.46765481 0.54179779 0.46109894 0.53609614 0.47825095 0.44012799 0.55442155 0.4230863 0.52563909]
mean value: 0.4979757093671649

key: test_accuracy
value: [0.71428571 0.42857143 0.66666667 0.66666667 0.76190476 0.85714286 0.57142857 0.66666667 0.71428571 0.76190476]
mean value: 0.680952380952381

key: train_accuracy
value: [0.77248677 0.71957672 0.76719577 0.73015873 0.76719577 0.73544974 0.71957672 0.77248677 0.69312169 0.75661376]
mean value: 0.7433862433862434

key: test_fscore
value: [0.72727273 0.45454545 0.72 0.58823529 0.76190476 0.86956522 0.57142857 0.72 0.76923077 0.7826087 ]
mean value: 0.696479149154341

key: train_fscore
value: [0.7902439 0.76233184 0.78640777 0.72432432 0.77777778 0.75490196 0.70718232 0.7902439 0.74336283 0.77884615]
mean value: 0.7615622779466328

key: test_precision
value: [0.66666667 0.41666667 0.6 0.71428571 0.72727273 0.83333333 0.6 0.64285714 0.66666667 0.75 ]
mean value: 0.6617748917748918

key: train_precision
value: [0.73636364 0.6640625 0.72972973 0.74444444 0.74757282 0.7 0.73563218 0.72972973 0.63636364 0.71052632]
mean value: 0.7134424991862677

key: test_recall
value: [0.8 0.5 0.9 0.5 0.8 0.90909091 0.54545455 0.81818182 0.90909091 0.81818182]
mean value: 0.75

key: train_recall
value: [0.85263158 0.89473684 0.85263158 0.70526316 0.81052632 0.81914894 0.68085106 0.86170213 0.89361702 0.86170213]
mean value: 0.8232810750279955

key: test_roc_auc
value: [0.71818182 0.43181818 0.67727273 0.65909091 0.76363636 0.85454545 0.57272727 0.65909091 0.70454545 0.75909091]
mean value: 0.68

key: train_roc_auc
value: [0.77206047 0.71864502 0.76674132 0.73029115 0.76696529 0.73589026 0.7193729 0.77295633 0.69417693 0.75716685]
mean value: 0.7434266517357223

key: test_jcc
value: [0.57142857 0.29411765 0.5625 0.41666667 0.61538462 0.76923077 0.4 0.5625 0.625 0.64285714]
mean value: 0.5459685412626589

key: train_jcc
value: [0.65322581 0.61594203 0.648 0.56779661 0.63636364 0.60629921 0.54700855 0.65322581 0.5915493 0.63779528]
mean value: 0.6157206219394032

MCC on Blind test: 0.12
Accuracy on Blind test: 0.6

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree',
ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())])

key: fit_time
value: [0.01332068 0.01345658 0.01338673 0.01330256 0.01338434 0.01339674 0.01414132 0.01341271 0.01346302 0.01340342]
mean value: 0.013466811180114746

key: score_time
value: [0.0123136 0.01233149 0.01238203 0.01235437 0.01238656 0.01232052 0.02520466 0.01238585 0.01245117 0.0123682 ]
mean value: 0.013649845123291015

key: test_mcc
value: [0.23373675 0.42817442 0.14545455 0.45226702 0.42817442 0.55161872 0.06741999 0.23636364 0.55161872 0.60302269]
mean value: 0.36978509083764477

key: train_mcc
value: [0.55646909 0.49995455 0.56569532 0.45906255 0.53448943 0.46424351 0.54251375 0.55585218 0.49572783 0.4861571 ]
mean value: 0.5160165310554382

key: test_accuracy
value: [0.61904762 0.66666667 0.57142857 0.71428571 0.66666667 0.76190476 0.52380952 0.61904762 0.76190476 0.76190476]
mean value: 0.6666666666666666

key: train_accuracy
value: [0.76719577 0.74074074 0.77248677 0.71957672 0.75661376 0.72486772 0.76190476 0.77248677 0.73544974 0.73544974]
mean value: 0.7486772486772486

key: test_fscore
value: [0.55555556 0.46153846 0.57142857 0.625 0.46153846 0.73684211 0.44444444 0.63636364 0.73684211 0.70588235]
mean value: 0.5935435694336623

key: train_fscore
value: [0.73170732 0.7030303 0.73939394 0.67484663 0.7195122 0.68292683 0.72392638 0.74556213 0.6835443 0.69512195]
mean value: 0.7099571975217122

key: test_precision
value: [0.625 1. 0.54545455 0.83333333 1. 0.875 0.57142857 0.63636364 0.875 1. ]
mean value: 0.7961580086580087

key: train_precision
value: [0.86956522 0.82857143 0.87142857 0.80882353 0.85507246 0.8 0.85507246 0.84 0.84375 0.81428571]
mean value: 0.8386569388625016

key: test_recall
value: [0.5 0.3 0.6 0.5 0.3 0.63636364 0.36363636 0.63636364 0.63636364 0.54545455]
mean value: 0.5018181818181818

key: train_recall
value: [0.63157895 0.61052632 0.64210526 0.57894737 0.62105263 0.59574468 0.62765957 0.67021277 0.57446809 0.60638298]
mean value: 0.6158678611422173

key: test_roc_auc
value: [0.61363636 0.65 0.57272727 0.70454545 0.65 0.76818182 0.53181818 0.61818182 0.76818182 0.77272727]
mean value: 0.665

key: train_roc_auc
value: [0.76791713 0.74143337 0.77318029 0.72032475 0.75733483 0.72418813 0.76119821 0.77194849 0.73460246 0.73477044]
mean value: 0.7486898096304592

key: test_jcc
value: [0.38461538 0.3 0.4 0.45454545 0.3 0.58333333 0.28571429 0.46666667 0.58333333 0.54545455]
mean value: 0.43036630036630036

key: train_jcc
value: [0.57692308 0.54205607 0.58653846 0.50925926 0.56190476 0.51851852 0.56730769 0.59433962 0.51923077 0.53271028]
mean value: 0.5508788517464236

MCC on Blind test: 0.17
Accuracy on Blind test: 0.6

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()),
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01281309 0.01259708 0.01266217 0.0125277 0.01282144 0.01259828 0.01269078 0.01281691 0.01241755 0.0138123 ] mean value: 0.012775731086730958 key: score_time value: [0.02045655 0.03969097 0.05234361 0.0366714 0.03734803 0.03721189 0.03660083 0.03660321 0.0360918 0.05223918] mean value: 0.038525748252868655 key: test_mcc value: [-0.06741999 0.23373675 0.03015113 0.33636364 0.13858047 0.14545455 -0.26593594 0.42727273 0.33636364 0.18090681] mean value: 0.14954737717597608 key: train_mcc value: [0.57375166 0.50637592 0.61142844 0.55051844 0.49793339 0.55594205 0.6099783 0.5498651 0.55978224 0.55189788] mean value: 0.5567473416942506 key: test_accuracy value: [0.47619048 0.61904762 0.52380952 0.66666667 0.57142857 0.57142857 0.38095238 0.71428571 0.66666667 0.57142857] mean value: 0.5761904761904761 key: train_accuracy value: [0.78306878 0.75132275 0.8042328 0.77248677 0.74603175 0.77777778 0.8042328 0.77248677 0.77777778 0.76719577] mean value: 0.7756613756613756 key: test_fscore value: [0.35294118 0.55555556 0.375 0.66666667 0.4 0.57142857 0.13333333 0.72727273 0.66666667 0.47058824] mean value: 0.49194529326882264 key: train_fscore value: [0.76571429 0.73743017 0.79558011 0.75706215 0.72727273 0.77173913 0.79558011 0.75428571 0.76136364 0.73170732] mean value: 0.7597735346629213 key: test_precision value: [0.42857143 0.625 0.5 0.63636364 0.6 0.6 0.25 0.72727273 0.7 0.66666667] mean value: 0.5733874458874458 key: train_precision value: [0.8375 0.78571429 0.8372093 0.81707317 0.79012346 0.78888889 0.82758621 0.81481481 0.81707317 0.85714286] mean value: 
0.8173126154036517 key: test_recall value: [0.3 0.5 0.3 0.7 0.3 0.54545455 0.09090909 0.72727273 0.63636364 0.36363636] mean value: 0.44636363636363635 key: train_recall value: [0.70526316 0.69473684 0.75789474 0.70526316 0.67368421 0.75531915 0.76595745 0.70212766 0.71276596 0.63829787] mean value: 0.7111310190369541 key: test_roc_auc value: [0.46818182 0.61363636 0.51363636 0.66818182 0.55909091 0.57272727 0.39545455 0.71363636 0.66818182 0.58181818] mean value: 0.5754545454545454 key: train_roc_auc value: [0.78348264 0.75162374 0.80447928 0.77284434 0.74641657 0.77765957 0.80403135 0.77211646 0.77743561 0.76651736] mean value: 0.7756606942889137 key: test_jcc value: [0.21428571 0.38461538 0.23076923 0.5 0.25 0.4 0.07142857 0.57142857 0.5 0.30769231] mean value: 0.34302197802197804 key: train_jcc value: [0.62037037 0.5840708 0.66055046 0.60909091 0.57142857 0.62831858 0.66055046 0.60550459 0.6146789 0.57692308] mean value: 0.6131486712013626 MCC on Blind test: 0.05 Accuracy on Blind test: 0.53 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, 
gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01701498 0.01680541 0.01711583 0.01694584 0.01700974 0.01702642 0.01688266 0.0170269 0.01668429 0.01705098] mean value: 0.016956305503845213 key: score_time value: [0.01340604 0.01336288 0.01338744 0.01337409 0.01337981 0.01351428 0.01340461 0.01340222 0.01311612 0.01343751] mean value: 0.013378500938415527 key: test_mcc value: [0.13762047 0.24120908 0.44038551 0.23373675 0.36244122 0.71818182 0.14545455 0.42727273 0.61818182 0.52727273] mean value: 0.38517566540238385 key: train_mcc value: [0.73654755 0.68655917 0.75666293 0.73549832 0.69804157 0.73867014 0.75694773 0.81109216 0.75994222 0.76164115] mean value: 0.7441602942290075 key: test_accuracy value: [0.57142857 0.61904762 0.71428571 0.61904762 0.66666667 0.85714286 0.57142857 0.71428571 0.80952381 0.76190476] mean value: 0.6904761904761905 key: train_accuracy value: [0.86772487 0.84126984 0.87830688 0.86772487 0.84656085 0.86772487 0.87830688 0.9047619 0.87830688 0.87830688] mean value: 0.8708994708994708 key: test_fscore value: [0.52631579 0.5 0.72727273 0.55555556 0.53333333 0.85714286 0.57142857 0.72727273 0.81818182 0.76190476] mean value: 0.6578408141566037 key: train_fscore value: [0.86486486 0.83333333 0.87830688 0.86772487 0.83798883 0.8603352 0.87567568 0.9010989 0.87150838 0.8700565 ] mean value: 0.86608934204143 key: test_precision value: [0.55555556 0.66666667 0.66666667 0.625 0.8 0.9 0.6 0.72727273 0.81818182 0.8 ] mean value: 0.7159343434343435 key: train_precision value: [0.88888889 0.88235294 0.88297872 0.87234043 0.89285714 0.90588235 0.89010989 0.93181818 0.91764706 0.92771084] mean 
value: 0.8992586448924944 key: test_recall value: [0.5 0.4 0.8 0.5 0.4 0.81818182 0.54545455 0.72727273 0.81818182 0.72727273] mean value: 0.6236363636363637 key: train_recall value: [0.84210526 0.78947368 0.87368421 0.86315789 0.78947368 0.81914894 0.86170213 0.87234043 0.82978723 0.81914894] mean value: 0.8360022396416573 key: test_roc_auc value: [0.56818182 0.60909091 0.71818182 0.61363636 0.65454545 0.85909091 0.57272727 0.71363636 0.80909091 0.76363636] mean value: 0.6881818181818182 key: train_roc_auc value: [0.86786114 0.84154535 0.87833147 0.86774916 0.8468645 0.8674692 0.87821948 0.90459127 0.87805151 0.87799552] mean value: 0.8708678611422173 key: test_jcc value: [0.35714286 0.33333333 0.57142857 0.38461538 0.36363636 0.75 0.4 0.57142857 0.69230769 0.61538462] mean value: 0.5039277389277389 key: train_jcc value: [0.76190476 0.71428571 0.78301887 0.76635514 0.72115385 0.75490196 0.77884615 0.82 0.77227723 0.77 ] mean value: 0.7642743672809006 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.34039497 1.35265207 1.41161752 1.35997891 1.55491352 1.25403571 1.12641239 1.19176412 0.80105114 0.87126184] mean value: 1.2264082193374635 key: score_time value: [0.01979089 0.02390909 0.03533006 0.0225184 0.02177119 0.02182436 0.01981211 0.01836801 0.01252675 0.01538301] mean value: 0.021123385429382323 key: test_mcc value: [0.23636364 0.61818182 0.63305416 0.52727273 0.80909091 0.67419986 0.42727273 0.53935989 0.71818182 0.67419986] mean value: 0.5857177416208371 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61904762 0.80952381 0.80952381 0.76190476 0.9047619 0.80952381 0.71428571 0.76190476 0.85714286 0.80952381] mean value: 0.7857142857142857 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.6 0.8 0.81818182 0.76190476 0.9 0.77777778 0.72727273 0.8 0.85714286 0.77777778] mean value: 0.782005772005772 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.6 0.8 0.75 0.72727273 0.9 1. 0.72727273 0.71428571 0.9 1. ] mean value: 0.8118831168831169 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.6 0.8 0.9 0.8 0.9 0.63636364 0.72727273 0.90909091 0.81818182 0.63636364] mean value: 0.7727272727272727 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.61818182 0.80909091 0.81363636 0.76363636 0.90454545 0.81818182 0.71363636 0.75454545 0.85909091 0.81818182] mean value: 0.7872727272727272 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.42857143 0.66666667 0.69230769 0.61538462 0.81818182 0.63636364 0.57142857 0.66666667 0.75 0.63636364] mean value: 0.6481934731934732 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.01981091 0.0168035 0.01414537 0.01470256 0.01423645 0.01411271 0.01426768 0.01455116 0.01446676 0.01585817] mean value: 0.01529552936553955 key: score_time value: [0.01231694 0.0095036 0.00918722 0.00877905 0.00922728 0.00901771 0.00886106 0.00959349 0.00907612 0.00917816] mean value: 0.009474062919616699 key: test_mcc value: [0.80909091 1. 0.55161872 0.53935989 1. 0.90909091 0.80909091 0.71562645 0.90909091 0.90829511] mean value: 0.8151263804000601 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 1. 0.76190476 0.76190476 1. 
0.95238095 0.9047619 0.85714286 0.95238095 0.95238095] mean value: 0.9047619047619048 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.9 1. 0.7826087 0.70588235 1. 0.95238095 0.90909091 0.86956522 0.95238095 0.95652174] mean value: 0.9028430818967903 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9 1. 0.69230769 0.85714286 1. 1. 0.90909091 0.83333333 1. 0.91666667] mean value: 0.9108541458541458 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 0.9 0.6 1. 0.90909091 0.90909091 0.90909091 0.90909091 1. ] mean value: 0.9036363636363636 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90454545 1. 0.76818182 0.75454545 1. 0.95454545 0.90454545 0.85454545 0.95454545 0.95 ] mean value: 0.9045454545454545 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.81818182 1. 0.64285714 0.54545455 1. 0.90909091 0.83333333 0.76923077 0.90909091 0.91666667] mean value: 0.8343906093906094 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.09894037 0.09839034 0.09680152 0.09823012 0.09769559 0.09740853 0.09786224 0.09949851 0.09909749 0.09787583] mean value: 0.09818005561828613 key: score_time value: [0.01770234 0.01870775 0.01915932 0.01832509 0.01789999 0.01840591 0.01811194 0.0182302 0.01793718 0.01793575] mean value: 0.01824154853820801 key: test_mcc value: [0.44038551 0.62641448 0.60302269 0.42727273 0.90829511 0.80909091 0.44038551 0.53935989 0.71818182 0.82572282] mean value: 0.6338131459152349 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.71428571 0.80952381 0.76190476 0.71428571 0.95238095 0.9047619 0.71428571 0.76190476 0.85714286 0.9047619 ] mean value: 0.8095238095238095 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.72727273 0.77777778 0.8 0.7 0.94736842 0.90909091 0.7 0.8 0.85714286 0.9 ] mean value: 0.8118652692336903 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.66666667 0.875 0.66666667 0.7 1. 
0.90909091 0.77777778 0.71428571 0.9 1. ] mean value: 0.8209487734487735 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.7 1. 0.7 0.9 0.90909091 0.63636364 0.90909091 0.81818182 0.81818182] mean value: 0.8190909090909091 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.71818182 0.80454545 0.77272727 0.71363636 0.95 0.90454545 0.71818182 0.75454545 0.85909091 0.90909091] mean value: 0.8104545454545454 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.57142857 0.63636364 0.66666667 0.53846154 0.9 0.83333333 0.53846154 0.66666667 0.75 0.81818182] mean value: 0.6919563769563769 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, 
min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.0100956 0.01037574 0.00979137 0.00928307 0.00931144 0.00919366 0.01059389 0.01200008 0.00972986 0.01077032] mean value: 0.010114502906799317 key: score_time value: [0.00959349 0.00903225 0.00895309 0.00887442 0.00888252 0.00879502 0.01142311 0.01174235 0.01009965 0.00888896] mean value: 0.009628486633300782 key: test_mcc value: [0.13762047 0.23373675 0.24771685 0.23636364 0.53935989 0.53935989 0.06741999 0.52295779 0.82572282 0.71818182] mean value: 0.40684398983090986 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.57142857 0.61904762 0.61904762 0.61904762 0.76190476 0.76190476 0.52380952 0.76190476 0.9047619 0.85714286] mean value: 0.7 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.52631579 0.55555556 0.63636364 0.6 0.70588235 0.8 0.44444444 0.7826087 0.9 0.85714286] mean value: 0.6808313331573528 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.55555556 0.625 0.58333333 0.6 0.85714286 0.71428571 0.57142857 0.75 1. 0.9 ] mean value: 0.7156746031746032 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.5 0.5 0.7 0.6 0.6 0.90909091 0.36363636 0.81818182 0.81818182 0.81818182] mean value: 0.6627272727272727 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.56818182 0.61363636 0.62272727 0.61818182 0.75454545 0.75454545 0.53181818 0.75909091 0.90909091 0.85909091] mean value: 0.6990909090909091 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.35714286 0.38461538 0.46666667 0.42857143 0.54545455 0.66666667 0.28571429 0.64285714 0.81818182 0.75 ] mean value: 0.5345870795870796 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: -0.11 Accuracy on Blind test: 0.47 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, 
verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.31620884 1.31651926 1.31475592 1.30594683 1.28883219 1.36383224 1.30705667 1.3408339 1.34590888 1.31465459] mean value: 1.321454930305481 key: score_time value: [0.09206462 0.09813571 0.09290099 0.09021497 0.09371018 0.09852195 0.09075403 0.09710979 0.09322071 0.09802985] mean value: 0.09446628093719482 key: test_mcc value: [0.52295779 0.71562645 0.39196475 0.52295779 0.90909091 1. 0.71818182 0.71562645 1. 0.61818182] mean value: 0.7114587764939949 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.76190476 0.85714286 0.66666667 0.76190476 0.95238095 1. 0.85714286 0.85714286 1. 0.80952381] mean value: 0.8523809523809524 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.73684211 0.84210526 0.72 0.73684211 0.95238095 1. 0.85714286 0.86956522 1. 0.81818182] mean value: 0.8533060318781143 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.77777778 0.88888889 0.6 0.77777778 0.90909091 1. 0.9 0.83333333 1. 0.81818182] mean value: 0.8505050505050505 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.7 0.8 0.9 0.7 1. 1. 0.81818182 0.90909091 1. 0.81818182] mean value: 0.8645454545454545 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.75909091 0.85454545 0.67727273 0.75909091 0.95454545 1. 0.85909091 0.85454545 1. 0.80909091] mean value: 0.8527272727272728 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( [0.58333333 0.72727273 0.5625 0.58333333 0.90909091 1. 0.75 0.76923077 1. 0.69230769] mean value: 0.7577068764568765 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.85998034 1.44642282 1.04460621 0.97317791 0.90930486 0.89995694 0.87365675 0.88928413 0.94170761 0.95756435] mean value: 0.9795661926269531 key: score_time value: [0.2091279 0.2227211 0.17214561 0.14424253 0.16997743 0.16527343 0.17143083 0.20080829 0.17865419 0.12289047] mean value: 0.17572717666625975 key: test_mcc value: [0.52295779 0.53935989 0.39196475 0.62641448 0.80909091 0.82275335 0.52727273 0.61818182 1. 0.71818182] mean value: 0.6576177533597088 key: train_mcc value: [0.95788064 0.95788064 0.96830553 0.95788064 0.95788064 0.94757483 0.97905701 0.93736014 0.95789003 0.95789003] mean value: 0.9579600124491399 key: test_accuracy value: [0.76190476 0.76190476 0.66666667 0.80952381 0.9047619 0.9047619 0.76190476 0.80952381 1. 
0.85714286] mean value: 0.8238095238095238 key: train_accuracy value: [0.97883598 0.97883598 0.98412698 0.97883598 0.97883598 0.97354497 0.98941799 0.96825397 0.97883598 0.97883598] mean value: 0.9788359788359788 key: test_fscore value: [0.73684211 0.70588235 0.72 0.77777778 0.9 0.91666667 0.76190476 0.81818182 1. 0.85714286] mean value: 0.8194398339878216 key: train_fscore value: [0.97916667 0.97916667 0.98429319 0.97916667 0.97916667 0.97382199 0.98947368 0.96875 0.97894737 0.97894737] mean value: 0.9790900270965371 key: test_precision value: [0.77777778 0.85714286 0.6 0.875 0.9 0.84615385 0.8 0.81818182 1. 0.9 ] mean value: 0.83742562992563 key: train_precision value: [0.96907216 0.96907216 0.97916667 0.96907216 0.96907216 0.95876289 0.97916667 0.94897959 0.96875 0.96875 ] mean value: 0.9679864471561821 key: test_recall value: [0.7 0.6 0.9 0.7 0.9 1. 0.72727273 0.81818182 1. 0.81818182] mean value: 0.8163636363636364 key: train_recall value: [0.98947368 0.98947368 0.98947368 0.98947368 0.98947368 0.9893617 1. 0.9893617 0.9893617 0.9893617 ] mean value: 0.990481522956327 key: test_roc_auc value: [0.75909091 0.75454545 0.67727273 0.80454545 0.90454545 0.9 0.76363636 0.80909091 1. 0.85909091] mean value: 0.8231818181818182 key: train_roc_auc value: [0.9787794 0.9787794 0.98409854 0.9787794 0.9787794 0.97362822 0.98947368 0.96836506 0.97889138 0.97889138] mean value: 0.9788465845464726 key: test_jcc value: [0.58333333 0.54545455 0.5625 0.63636364 0.81818182 0.84615385 0.61538462 0.69230769 1. 
0.75 ] mean value: 0.7049679487179488 key: train_jcc value: [0.95918367 0.95918367 0.96907216 0.95918367 0.95918367 0.94897959 0.97916667 0.93939394 0.95876289 0.95876289] mean value: 0.9590872829919221 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01177764 0.01067424 0.01179218 0.01120353 0.01094937 0.01100659 0.01069164 0.01066709 0.01039362 0.01051068] mean value: 0.010966658592224121 key: score_time value: [0.01029491 0.00953531 0.00973797 0.01007605 0.01004219 0.00984097 0.01000714 0.01029038 0.009624 0.01030111] mean value: 0.009975004196166991 key: test_mcc value: [0.23373675 0.42817442 0.14545455 0.45226702 0.42817442 0.55161872 0.06741999 0.23636364 0.55161872 0.60302269] mean value: 0.36978509083764477 key: train_mcc value: [0.55646909 0.49995455 0.56569532 0.45906255 0.53448943 0.46424351 0.54251375 0.55585218 0.49572783 0.4861571 ] mean value: 0.5160165310554382 key: test_accuracy value: [0.61904762 0.66666667 0.57142857 0.71428571 0.66666667 0.76190476 0.52380952 0.61904762 0.76190476 0.76190476] mean value: 0.6666666666666666 key: train_accuracy value: [0.76719577 0.74074074 0.77248677 0.71957672 0.75661376 0.72486772 0.76190476 0.77248677 0.73544974 0.73544974] mean 
value: 0.7486772486772486 key: test_fscore value: [0.55555556 0.46153846 0.57142857 0.625 0.46153846 0.73684211 0.44444444 0.63636364 0.73684211 0.70588235] mean value: 0.5935435694336623 key: train_fscore value: [0.73170732 0.7030303 0.73939394 0.67484663 0.7195122 0.68292683 0.72392638 0.74556213 0.6835443 0.69512195] mean value: 0.7099571975217122 key: test_precision value: [0.625 1. 0.54545455 0.83333333 1. 0.875 0.57142857 0.63636364 0.875 1. ] mean value: 0.7961580086580087 key: train_precision value: [0.86956522 0.82857143 0.87142857 0.80882353 0.85507246 0.8 0.85507246 0.84 0.84375 0.81428571] mean value: 0.8386569388625016 key: test_recall value: [0.5 0.3 0.6 0.5 0.3 0.63636364 0.36363636 0.63636364 0.63636364 0.54545455] mean value: 0.5018181818181818 key: train_recall value: [0.63157895 0.61052632 0.64210526 0.57894737 0.62105263 0.59574468 0.62765957 0.67021277 0.57446809 0.60638298] mean value: 0.6158678611422173 key: test_roc_auc value: [0.61363636 0.65 0.57272727 0.70454545 0.65 0.76818182 0.53181818 0.61818182 0.76818182 0.77272727] mean value: 0.665 key: train_roc_auc value: [0.76791713 0.74143337 0.77318029 0.72032475 0.75733483 0.72418813 0.76119821 0.77194849 0.73460246 0.73477044] mean value: 0.7486898096304592 key: test_jcc value: [0.38461538 0.3 0.4 0.45454545 0.3 0.58333333 0.28571429 0.46666667 0.58333333 0.54545455] mean value: 0.43036630036630036 key: train_jcc value: [0.57692308 0.54205607 0.58653846 0.50925926 0.56190476 0.51851852 0.56730769 0.59433962 0.51923077 0.53271028] mean value: 0.5508788517464236 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, 
n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [1.0007422 1.08978605 0.60352254 0.19705272 0.85835528 0.39761066 0.57176566 0.29660773 0.19528532 0.50309563] mean value: 0.5713823795318603 key: score_time value: [0.013587 0.01366115 0.0136435 0.01247692 0.01475835 0.01236415 0.01414704 0.01221514 0.01440883 0.01542115] mean value: 0.013668322563171386 key: test_mcc value: [0.82275335 1. 0.39196475 0.80909091 0.90909091 1. 0.80909091 0.90829511 1. 0.71818182] mean value: 0.8368467750842328 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 1. 0.66666667 0.9047619 0.95238095 1. 0.9047619 0.95238095 1. 0.85714286] mean value: 0.9142857142857143 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 1. 0.72 0.9 0.95238095 1. 0.90909091 0.95652174 1. 0.85714286] mean value: 0.9184025346634043 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [1. 1. 0.6 0.9 0.90909091 1. 0.90909091 0.91666667 1. 0.9 ] mean value: 0.9134848484848485 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 1. 0.9 0.9 1. 1. 0.90909091 1. 1. 0.81818182] mean value: 0.9327272727272727 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 1. 0.67727273 0.90454545 0.95454545 1. 0.90454545 0.95 1. 0.85909091] mean value: 0.915 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 1. 0.5625 0.81818182 0.90909091 1. 0.83333333 0.91666667 1. 0.75 ] mean value: 0.8589772727272728 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', 
n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.03846598 0.02921033 0.0438509 0.10852194 0.03383636 0.02990055 0.03976679 0.04133964 0.06952095 0.03217459] mean value: 0.046658802032470706 key: score_time value: [0.04556179 0.01182222 0.03038454 0.01200247 0.01077509 0.01119876 0.01961231 0.01819873 0.01630306 0.01722431] mean value: 0.01930832862854004 key: test_mcc value: [0.43007562 0.52727273 0.33028913 0.71818182 0.74161985 0.71818182 0.33028913 0.80909091 0.82572282 0.63305416] mean value: 0.6063777984708973 key: train_mcc value: [0.96830553 0.96874655 0.97905237 0.96830907 0.96830907 0.95767077 0.97883539 0.96830907 0.94713854 0.96830553] mean value: 0.9672981891345079 key: test_accuracy value: [0.71428571 0.76190476 0.66666667 0.85714286 0.85714286 0.85714286 0.66666667 0.9047619 0.9047619 0.80952381] mean value: 0.7999999999999999 key: train_accuracy value: [0.98412698 0.98412698 0.98941799 0.98412698 0.98412698 0.97883598 0.98941799 0.98412698 0.97354497 0.98412698] mean value: 0.9835978835978836 key: test_fscore value: [0.66666667 0.76190476 0.63157895 0.85714286 0.82352941 0.85714286 0.69565217 0.90909091 0.9 0.8 ] mean value: 0.7902708584994222 key: train_fscore value: [0.98429319 0.98395722 0.98958333 0.98412698 0.98412698 0.9787234 0.9893617 0.98412698 0.97326203 0.98395722] mean value: 0.9835519056402777 key: test_precision value: [0.75 0.72727273 0.66666667 0.81818182 1. 0.9 0.66666667 0.90909091 1. 0.88888889] mean value: 0.8326767676767677 key: train_precision value: [0.97916667 1. 
0.97938144 0.9893617 0.9893617 0.9787234 0.9893617 0.97894737 0.97849462 0.98924731] mean value: 0.9852045924508858 key: test_recall value: [0.6 0.8 0.6 0.9 0.7 0.81818182 0.72727273 0.90909091 0.81818182 0.72727273] mean value: 0.76 key: train_recall value: [0.98947368 0.96842105 1. 0.97894737 0.97894737 0.9787234 0.9893617 0.9893617 0.96808511 0.9787234 ] mean value: 0.9820044792833147 key: test_roc_auc value: [0.70909091 0.76363636 0.66363636 0.85909091 0.85 0.85909091 0.66363636 0.90454545 0.90909091 0.81363636] mean value: 0.7995454545454546 key: train_roc_auc value: [0.98409854 0.98421053 0.9893617 0.98415454 0.98415454 0.97883539 0.98941769 0.98415454 0.97351624 0.98409854] mean value: 0.9836002239641657 key: test_jcc value: [0.5 0.61538462 0.46153846 0.75 0.7 0.75 0.53333333 0.83333333 0.81818182 0.66666667] mean value: 0.6628438228438228 key: train_jcc value: [0.96907216 0.96842105 0.97938144 0.96875 0.96875 0.95833333 0.97894737 0.96875 0.94791667 0.96842105] mean value: 0.9676743081931634 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.01121068 0.01103139 0.01107359 0.01116538 0.0111959 0.0111444 0.01113462 0.0111568 0.01111794 0.01115346] mean value: 0.011138415336608887 key: score_time value: [0.0101912 0.01027894 0.01037192 0.01032901 0.01054072 0.01028752 0.01039696 0.01047206 0.01040912 0.01026344] mean value: 0.010354089736938476 key: test_mcc value: [0.33028913 0.05504819 0.39196475 0.43007562 0.44038551 0.63305416 0.05504819 0.24120908 0.52295779 0.23636364] mean value: 0.3336396040864597 key: train_mcc value: [0.43065616 0.50382186 0.43289183 0.41958895 0.45044462 0.43991059 0.42961362 0.49404873 0.47201413 0.46267525] mean value: 0.45356657385003807 key: test_accuracy value: [0.66666667 0.52380952 0.66666667 0.71428571 0.71428571 0.80952381 0.52380952 0.61904762 0.76190476 0.61904762] mean value: 0.6619047619047619 key: train_accuracy value: [0.71428571 0.75132275 0.71428571 0.70899471 0.72486772 0.71957672 0.71428571 0.74603175 0.73544974 0.73015873] mean value: 0.7259259259259259 key: test_fscore value: [0.63157895 0.54545455 0.72 0.66666667 0.72727273 0.8 0.5 0.69230769 0.7826087 0.63636364] mean value: 0.6702252911085863 key: train_fscore value: [0.73 0.76142132 0.73529412 0.72361809 0.73469388 0.7253886 0.72164948 0.75510204 0.74226804 0.74111675] mean value: 0.7370552324342122 key: test_precision value: [0.66666667 0.5 0.6 0.75 0.66666667 0.88888889 0.55555556 0.6 0.75 0.63636364] mean value: 0.6614141414141415 key: train_precision value: [0.6952381 0.73529412 0.68807339 0.69230769 0.71287129 0.70707071 0.7 0.7254902 0.72 0.70873786] mean value: 0.7085083354043781 key: test_recall 
value: [0.6 0.6 0.9 0.6 0.8 0.72727273 0.45454545 0.81818182 0.81818182 0.63636364] mean value: 0.6954545454545454 key: train_recall value: [0.76842105 0.78947368 0.78947368 0.75789474 0.75789474 0.74468085 0.74468085 0.78723404 0.76595745 0.77659574] mean value: 0.7682306830907055 key: test_roc_auc value: [0.66363636 0.52727273 0.67727273 0.70909091 0.71818182 0.81363636 0.52727273 0.60909091 0.75909091 0.61818182] mean value: 0.6622727272727273 key: train_roc_auc value: [0.71399776 0.75111982 0.71388578 0.7087346 0.72469205 0.71970885 0.71444569 0.7462486 0.7356103 0.73040314] mean value: 0.7258846584546472 key: test_jcc value: [0.46153846 0.375 0.5625 0.5 0.57142857 0.66666667 0.33333333 0.52941176 0.64285714 0.46666667] mean value: 0.5109402607196725 key: train_jcc value: [0.57480315 0.6147541 0.58139535 0.56692913 0.58064516 0.56910569 0.56451613 0.60655738 0.59016393 0.58870968] mean value: 0.5837579700936688 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, 
gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01644301 0.01346326 0.01460147 0.01373553 0.01718259 0.01639462 0.01637864 0.01489592 0.01578617 0.01493788] mean value: 0.015381908416748047 key: score_time value: [0.0110271 0.01079273 0.01092768 0.01038527 0.01035404 0.01038051 0.01045108 0.01053333 0.01052833 0.0103631 ] mean value: 0.01057431697845459 key: test_mcc value: [0.4719399 0.61818182 0.4719399 0.74161985 0.71562645 0.60302269 0.23373675 0.71562645 0.74161985 0.60302269] mean value: 0.5916336343526927 key: train_mcc value: [0.86125076 0.84693232 0.84923609 0.66188185 0.95789003 0.87061974 0.8157737 0.89595041 0.80682683 0.88957791] mean value: 0.8455939633920262 key: test_accuracy value: [0.71428571 0.80952381 0.71428571 0.85714286 0.85714286 0.76190476 0.61904762 0.85714286 0.85714286 0.76190476] mean value: 0.780952380952381 key: train_accuracy value: [0.92592593 0.92063492 0.92063492 0.8042328 0.97883598 0.93121693 0.8994709 0.94708995 0.89417989 0.94179894] mean value: 0.9164021164021163 key: test_fscore value: [0.75 0.8 0.75 0.82352941 0.84210526 0.70588235 0.66666667 0.86956522 0.88 0.70588235] mean value: 0.7793631264862925 key: train_fscore value: [0.93137255 0.92537313 0.92610837 0.75816993 0.9787234 0.92571429 0.90821256 0.94505495 0.90384615 0.93785311] mean value: 0.9140428448974536 key: test_precision value: [0.64285714 0.8 0.64285714 1. 0.88888889 1. 0.61538462 0.83333333 0.78571429 1. ] mean value: 0.8209035409035409 key: train_precision value: [0.87155963 0.87735849 0.87037037 1. 0.98924731 1. 0.83185841 0.97727273 0.8245614 1. 
] mean value: 0.9242228343653033 key: test_recall value: [0.9 0.8 0.9 0.7 0.8 0.54545455 0.72727273 0.90909091 1. 0.54545455] mean value: 0.7827272727272727 key: train_recall value: [1. 0.97894737 0.98947368 0.61052632 0.96842105 0.86170213 1. 0.91489362 1. 0.88297872] mean value: 0.9206942889137738 key: test_roc_auc value: [0.72272727 0.80909091 0.72272727 0.85 0.85454545 0.77272727 0.61363636 0.85454545 0.85 0.77272727] mean value: 0.7822727272727272 key: train_roc_auc value: [0.92553191 0.92032475 0.92026876 0.80526316 0.97889138 0.93085106 0.9 0.94692049 0.89473684 0.94148936] mean value: 0.916427771556551 key: test_jcc value: [0.6 0.66666667 0.6 0.7 0.72727273 0.54545455 0.5 0.76923077 0.78571429 0.54545455] mean value: 0.6439793539793539 key: train_jcc value: [0.87155963 0.86111111 0.86238532 0.61052632 0.95833333 0.86170213 0.83185841 0.89583333 0.8245614 0.88297872] mean value: 0.846084970934794 MCC on Blind test: 0.49 Accuracy on Blind test: 0.67 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.0150187 0.01364851 0.01349545 0.01297951 0.01252437 0.01285005 0.01290178 0.0133462 0.01365423 0.01284862] mean value: 0.013326740264892578 key: score_time value: [0.01063037 0.01029801 0.01048899 0.01053691 0.01031756 0.0104537 0.01043105 0.01037478 0.01037979 0.01044369] mean value: 0.01043548583984375 key: test_mcc value: [0.33709993 0.53935989 0.53935989 0.38924947 0.82572282 0.50874702 0.42727273 0.66332496 1. 0.52727273] mean value: 0.5757409438784044 key: train_mcc value: [0.80452249 0.84518345 0.79793785 0.47083798 0.80904214 0.61283493 0.85498064 0.88405964 0.88405964 0.89601922] mean value: 0.7859477973314178 key: test_accuracy value: [0.66666667 0.76190476 0.76190476 0.61904762 0.9047619 0.71428571 0.71428571 0.80952381 1. 0.76190476] mean value: 0.7714285714285715 key: train_accuracy value: [0.8994709 0.92063492 0.88888889 0.68253968 0.8994709 0.77248677 0.92592593 0.94179894 0.94179894 0.94708995] mean value: 0.882010582010582 key: test_fscore value: [0.58823529 0.70588235 0.70588235 0.71428571 0.90909091 0.78571429 0.72727273 0.84615385 1. 0.76190476] mean value: 0.7744422244422244 key: train_fscore value: [0.89385475 0.91712707 0.87573964 0.76 0.90731707 0.81385281 0.92857143 0.94240838 0.94240838 0.94845361] mean value: 0.892973314316607 key: test_precision value: [0.71428571 0.85714286 0.85714286 0.55555556 0.83333333 0.64705882 0.72727273 0.73333333 1. 0.8 ] mean value: 0.7725125201595789 key: train_precision value: [0.95238095 0.96511628 1. 
0.61290323 0.84545455 0.68613139 0.89215686 0.92783505 0.92783505 0.92 ] mean value: 0.8729813355410913 key: test_recall value: [0.5 0.6 0.6 1. 1. 1. 0.72727273 1. 1. 0.72727273] mean value: 0.8154545454545454 key: train_recall value: [0.84210526 0.87368421 0.77894737 1. 0.97894737 1. 0.96808511 0.95744681 0.95744681 0.9787234 ] mean value: 0.933538633818589 key: test_roc_auc value: [0.65909091 0.75454545 0.75454545 0.63636364 0.90909091 0.7 0.71363636 0.8 1. 0.76363636] mean value: 0.769090909090909 key: train_roc_auc value: [0.89977604 0.92088466 0.88947368 0.68085106 0.89904815 0.77368421 0.92614782 0.9418813 0.9418813 0.94725644] mean value: 0.8820884658454647 key: test_jcc value: [0.41666667 0.54545455 0.54545455 0.55555556 0.83333333 0.64705882 0.57142857 0.73333333 1. 0.61538462] mean value: 0.6463669990140578 key: train_jcc value: [0.80808081 0.84693878 0.77894737 0.61290323 0.83035714 0.68613139 0.86666667 0.89108911 0.89108911 0.90196078] mean value: 0.8114164376339148 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.14535904 0.14270997 0.1418457 0.14258361 0.14160824 0.14119768 0.14122701 0.15788913 0.1422658 0.14225483] mean value: 0.14389410018920898 key: score_time value: [0.01751041 0.0176034 0.01684165 0.01729679 0.01721644 0.01762438 0.01736569 0.01751566 0.01748872 0.01746202] mean value: 0.017392516136169434 key: test_mcc value: [0.71562645 0.90829511 0.26967994 0.71562645 0.90909091 0.80909091 0.71818182 0.71818182 1. 0.71562645] mean value: 0.7479399847756402 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.85714286 0.95238095 0.61904762 0.85714286 0.95238095 0.9047619 0.85714286 0.85714286 1. 0.85714286] mean value: 0.8714285714285714 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.84210526 0.94736842 0.66666667 0.84210526 0.95238095 0.90909091 0.85714286 0.85714286 1. 0.86956522] mean value: 0.8743568407183968 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.88888889 1. 0.57142857 0.88888889 0.90909091 0.90909091 0.9 0.9 1. 0.83333333] mean value: 0.88007215007215 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.9 0.8 0.8 1. 0.90909091 0.81818182 0.81818182 1. 0.90909091] mean value: 0.8754545454545455 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.85454545 0.95 0.62727273 0.85454545 0.95454545 0.90454545 0.85909091 0.85909091 1. 
0.85454545] mean value: 0.8718181818181818 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.72727273 0.9 0.5 0.72727273 0.90909091 0.83333333 0.75 0.75 1. 0.76923077] mean value: 0.7866200466200466 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', 
SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.03950834 0.0361855 0.0394659 0.03514695 0.03398728 0.0323205 0.03702044 0.03738832 0.03468561 0.03310966] mean value: 0.035881853103637694 key: score_time value: [0.01812267 0.01860189 0.01861978 0.08073974 0.01802468 0.01793027 0.02523518 0.01894164 0.01871896 0.01882577] mean value: 0.025376057624816893 key: test_mcc value: [0.62641448 0.82275335 0.4719399 0.90829511 1. 0.90829511 0.80909091 0.90829511 0.82572282 0.90829511] mean value: 0.8189101896090047 key: train_mcc value: [0.97883539 0.97883539 0.98947368 0.98947368 1. 0.98947251 0.98947368 0.97905237 0.98947368 0.98947251] mean value: 0.987356290106085 key: test_accuracy value: [0.80952381 0.9047619 0.71428571 0.95238095 1. 
0.95238095 0.9047619 0.95238095 0.9047619 0.95238095] mean value: 0.9047619047619048 key: train_accuracy value: [0.98941799 0.98941799 0.99470899 0.99470899 1. 0.99470899 0.99470899 0.98941799 0.99470899 0.99470899] mean value: 0.9936507936507937 key: test_fscore value: [0.77777778 0.88888889 0.75 0.94736842 1. 0.95652174 0.90909091 0.95652174 0.9 0.95652174] mean value: 0.9042691214201511 key: train_fscore value: [0.98947368 0.98947368 0.99470899 0.99470899 1. 0.99465241 0.99470899 0.98924731 0.99470899 0.99465241] mean value: 0.9936335471919213 key: test_precision value: [0.875 1. 0.64285714 1. 1. 0.91666667 0.90909091 0.91666667 1. 0.91666667] mean value: 0.9176948051948052 key: train_precision value: [0.98947368 0.98947368 1. 1. 1. 1. 0.98947368 1. 0.98947368 1. ] mean value: 0.9957894736842106 key: test_recall value: [0.7 0.8 0.9 0.9 1. 1. 0.90909091 1. 0.81818182 1. ] mean value: 0.9027272727272727 key: train_recall value: [0.98947368 0.98947368 0.98947368 0.98947368 1. 0.9893617 1. 0.9787234 1. 0.9893617 ] mean value: 0.9915341545352744 key: test_roc_auc value: [0.80454545 0.9 0.72272727 0.95 1. 0.95 0.90454545 0.95 0.90909091 0.95 ] mean value: 0.9040909090909091 key: train_roc_auc value: [0.98941769 0.98941769 0.99473684 0.99473684 1. 0.99468085 0.99473684 0.9893617 0.99473684 0.99468085] mean value: 0.9936506159014558 key: test_jcc value: [0.63636364 0.8 0.6 0.9 1. 0.91666667 0.83333333 0.91666667 0.81818182 0.91666667] mean value: 0.8337878787878787 key: train_jcc value: [0.97916667 0.97916667 0.98947368 0.98947368 1. 
0.9893617 0.98947368 0.9787234 0.98947368 0.9893617 ] mean value: 0.9873674878686076 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), 
('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.08917499 0.13380027 0.17171621 0.11417508 0.11032367 0.10746312 0.12534523 0.12104058 0.1084094 0.1078198 ] mean value: 0.11892683506011963 key: score_time value: [0.03536677 0.02449703 0.02127337 0.01806617 0.02497578 0.02218723 0.02117968 0.02874351 0.02377129 0.03750396] mean value: 0.025756478309631348 key: test_mcc value: [0.03739788 0.53935989 0.52295779 0.23636364 0.62641448 0.33028913 0.18090681 0.52727273 0.39196475 0.55161872] mean value: 0.3944545813288632 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.52380952 0.76190476 0.76190476 0.61904762 0.80952381 0.66666667 0.57142857 0.76190476 0.66666667 0.76190476] mean value: 0.6904761904761905 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.44444444 0.70588235 0.73684211 0.6 0.77777778 0.69565217 0.47058824 0.76190476 0.58823529 0.73684211] mean value: 0.6518169250919285 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.5 0.85714286 0.77777778 0.6 0.875 0.66666667 0.66666667 0.8 0.83333333 0.875 ] mean value: 0.7451587301587301 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.4 0.6 0.7 0.6 0.7 0.72727273 0.36363636 0.72727273 0.45454545 0.63636364] mean value: 0.5909090909090909 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.51818182 0.75454545 0.75909091 0.61818182 0.80454545 0.66363636 0.58181818 0.76363636 0.67727273 0.76818182] mean value: 0.6909090909090909 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.28571429 0.54545455 0.58333333 0.42857143 0.63636364 0.53333333 0.30769231 0.61538462 0.41666667 0.58333333] mean value: 0.4935847485847486 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.46492267 0.45418954 0.45839477 0.45019031 0.45531964 0.45568514 0.44885039 0.45692539 0.45565152 0.44814992] mean value: 0.45482792854309084 key: score_time value: [0.01082897 0.01071215 0.01069975 0.0107615 0.01069736 0.01068664 0.01067924 0.01079154 0.01067519 0.01112723] mean value: 0.010765957832336425 key: test_mcc value: [0.90829511 1. 0.4719399 0.90909091 0.90909091 0.90829511 0.80909091 0.90829511 1. 0.90829511] mean value: 0.8732393055913986 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95238095 1. 0.71428571 0.95238095 0.95238095 0.95238095 0.9047619 0.95238095 1. 0.95238095] mean value: 0.9333333333333333 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 1. 0.75 0.95238095 0.95238095 0.95652174 0.90909091 0.95652174 1. 0.95652174] mean value: 0.938078645229675 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.64285714 0.90909091 0.90909091 0.91666667 0.90909091 0.91666667 1. 0.91666667] mean value: 0.9120129870129869 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 0.9 1. 1. 1. 0.90909091 1. 1. 1. ] mean value: 0.9709090909090909 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.95 1. 0.72272727 0.95454545 0.95454545 0.95 0.90454545 0.95 1. 0.95 ] mean value: 0.9336363636363636 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.9 1. 0.6 0.90909091 0.90909091 0.91666667 0.83333333 0.91666667 1. 0.91666667] mean value: 0.8901515151515151 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.07232475 0.05349302 0.05705237 0.06523085 0.04446149 0.04642963 0.0278461 0.02402425 0.02402234 0.0231905 ] mean value: 0.04380753040313721 key: score_time value: [0.02672768 0.02717161 0.02339268 0.01948118 0.02111459 0.0229249 0.01305509 0.01254392 0.01267433 0.01560426] mean value: 0.019469022750854492 key: test_mcc value: [0.55161872 0.74795759 0.74795759 0.55161872 0.82572282 0.90829511 0.74161985 0.66332496 0.90829511 0.80909091] mean value: 0.7455501384398332 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.76190476 0.85714286 0.85714286 0.76190476 0.9047619 0.95238095 0.85714286 0.80952381 0.95238095 0.9047619 ] mean value: 0.8619047619047618 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_fscore value: [0.7826087 0.86956522 0.86956522 0.7826087 0.90909091 0.95652174 0.88 0.84615385 0.95652174 0.90909091] mean value: 0.876172696868349 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.69230769 0.76923077 0.76923077 0.69230769 0.83333333 0.91666667 0.78571429 0.73333333 0.91666667 0.90909091] mean value: 0.8017882117882118 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 1. 1. 0.9 1. 1. 1. 1. 1. 0.90909091] mean value: 0.9709090909090909 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.76818182 0.86363636 0.86363636 0.76818182 0.90909091 0.95 0.85 0.8 0.95 0.90454545] mean value: 0.8627272727272727 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.64285714 0.76923077 0.76923077 0.64285714 0.83333333 0.91666667 0.78571429 0.73333333 0.91666667 0.83333333] mean value: 0.7843223443223444 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.08 Accuracy on Blind test: 0.6 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.047019 0.05411768 0.05120516 0.08179045 0.03637123 0.03757405 0.0525198 0.0249393 0.07093072 0.05749607] mean value: 0.05139634609222412 key: score_time value: [0.06786156 0.01952934 0.03211212 0.01617074 0.04544282 0.0289197 0.03419495 0.01245379 0.02793956 0.02250385] mean value: 0.03071284294128418 key: test_mcc value: [0.52295779 0.80909091 0.44038551 1. 0.80909091 0.82572282 0.55161872 0.82275335 1. 0.63305416] mean value: 0.7414674176772243 key: train_mcc value: [0.97883539 0.95789003 0.94713854 0.9264031 0.94714446 0.94757483 0.96830553 0.94757483 0.93650616 0.93736014] mean value: 0.9494732992955751 key: test_accuracy value: [0.76190476 0.9047619 0.71428571 1. 0.9047619 0.9047619 0.76190476 0.9047619 1. 0.80952381] mean value: 0.8666666666666667 key: train_accuracy value: [0.98941799 0.97883598 0.97354497 0.96296296 0.97354497 0.97354497 0.98412698 0.97354497 0.96825397 0.96825397] mean value: 0.9746031746031746 key: test_fscore value: [0.73684211 0.9 0.72727273 1. 0.9 0.9 0.73684211 0.91666667 1. 
0.8 ] mean value: 0.861762360446571 key: train_fscore value: [0.98947368 0.9787234 0.97382199 0.96256684 0.97354497 0.97382199 0.98395722 0.97382199 0.96808511 0.96875 ] mean value: 0.9746567201151308 key: test_precision value: [0.77777778 0.9 0.66666667 1. 0.9 1. 0.875 0.84615385 1. 0.88888889] mean value: 0.8854487179487179 key: train_precision value: [0.98947368 0.98924731 0.96875 0.97826087 0.9787234 0.95876289 0.98924731 0.95876289 0.96808511 0.94897959] mean value: 0.9728293053102567 key: test_recall value: [0.7 0.9 0.8 1. 0.9 0.81818182 0.63636364 1. 1. 0.72727273] mean value: 0.8481818181818181 /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:128: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:131: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) key: train_recall value: [0.98947368 0.96842105 0.97894737 0.94736842 0.96842105 0.9893617 0.9787234 0.9893617 0.96808511 0.9893617 ] mean value: 0.9767525195968645 key: test_roc_auc value: [0.75909091 0.90454545 0.71818182 1. 0.90454545 0.90909091 0.76818182 0.9 1. 0.81363636] mean value: 0.8677272727272727 key: train_roc_auc value: [0.98941769 0.97889138 0.97351624 0.96304591 0.97357223 0.97362822 0.98409854 0.97362822 0.96825308 0.96836506] mean value: 0.9746416573348264 key: test_jcc value: [0.58333333 0.81818182 0.57142857 1. 0.81818182 0.81818182 0.58333333 0.84615385 1.
0.66666667] mean value: 0.7705461205461206 key: train_jcc value: [0.97916667 0.95833333 0.94897959 0.92783505 0.94845361 0.94897959 0.96842105 0.94897959 0.93814433 0.93939394] mean value: 0.9506686757226445 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, 
oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.24432898 0.38443637 0.46159816 0.37851357 0.30649304 0.50042582 0.33015323 0.36040902 0.35314488 0.34496379] mean value: 0.36644668579101564 key: score_time value: [0.02193141 0.03222609 0.03092313 0.02275586 0.02286959 0.02013755 0.02346706 0.01971507 0.01932216 0.02431846] mean value: 0.023766636848449707 key: test_mcc value: [0.62641448 0.71562645 0.52295779 1. 0.80909091 0.52727273 0.42727273 0.82275335 1. 0.63305416] mean value: 0.7084442598864285 key: train_mcc value: [0.98947251 0.94714446 0.95767077 0.93650616 0.93650616 0.95789003 0.95789003 0.94757483 0.93650616 0.93736014] mean value: 0.9504521239520874 key: test_accuracy value: [0.80952381 0.85714286 0.76190476 1. 0.9047619 0.76190476 0.71428571 0.9047619 1. 
0.80952381] mean value: 0.8523809523809524 key: train_accuracy value: [0.99470899 0.97354497 0.97883598 0.96825397 0.96825397 0.97883598 0.97883598 0.97354497 0.96825397 0.96825397] mean value: 0.9751322751322751 key: test_fscore value: [0.77777778 0.84210526 0.73684211 1. 0.9 0.76190476 0.72727273 0.91666667 1. 0.8 ] mean value: 0.8462569302042986 key: train_fscore value: [0.9947644 0.97354497 0.97894737 0.96842105 0.96842105 0.97894737 0.97894737 0.97382199 0.96808511 0.96875 ] mean value: 0.9752650677888823 key: test_precision value: [0.875 0.88888889 0.77777778 1. 0.9 0.8 0.72727273 0.84615385 1. 0.88888889] mean value: 0.8703982128982128 key: train_precision value: [0.98958333 0.9787234 0.97894737 0.96842105 0.96842105 0.96875 0.96875 0.95876289 0.96808511 0.94897959] mean value: 0.9697423796090515 key: test_recall value: [0.7 0.8 0.7 1. 0.9 0.72727273 0.72727273 1. 1. 0.72727273] mean value: 0.8281818181818181 key: train_recall value: [1. 0.96842105 0.97894737 0.96842105 0.96842105 0.9893617 0.9893617 0.9893617 0.96808511 0.9893617 ] mean value: 0.9809742441209407 key: test_roc_auc value: [0.80454545 0.85454545 0.75909091 1. 0.90454545 0.76363636 0.71363636 0.9 1. 0.81363636] mean value: 0.8513636363636363 key: train_roc_auc value: [0.99468085 0.97357223 0.97883539 0.96825308 0.96825308 0.97889138 0.97889138 0.97362822 0.96825308 0.96836506] mean value: 0.9751623740201567 key: test_jcc value: [0.63636364 0.72727273 0.58333333 1. 0.81818182 0.61538462 0.57142857 0.84615385 1. 
0.66666667] mean value: 0.7464785214785215 key: train_jcc value: [0.98958333 0.94845361 0.95876289 0.93877551 0.93877551 0.95876289 0.95876289 0.94897959 0.93814433 0.93939394] mean value: 0.9518394482910315 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.03008127 0.03638411 0.03371334 0.03461933 0.088166 0.07241488 0.03378487 0.06745028 0.06802797 0.06680107] mean value: 0.053144311904907225 key: score_time value: [0.01519585 0.01250982 0.01437712 0.01443815 0.02278161 0.01534963 0.01221824 0.02406025 0.01457262 0.01535487] mean value: 0.0160858154296875 key: test_mcc value: [0.33028913 0.62641448 0.44038551 0.61818182 0.82275335 0.80909091 0.55161872 0.80909091 0.42727273 0.71818182] mean value: 0.6153279376024733 key: train_mcc value: [0.8738236 0.8314659 0.83068309 0.862486 0.8738236 0.87319373 0.90480458 0.88402082 0.86284197 0.89438907] mean value: 0.8691532351249575 key: test_accuracy value: [0.66666667 0.80952381 0.71428571 0.80952381 0.9047619 0.9047619 0.76190476 0.9047619 0.71428571 0.85714286] mean value: 0.8047619047619048 key: train_accuracy value: [0.93650794 0.91534392 0.91534392 0.93121693 0.93650794 0.93650794 0.95238095 0.94179894 0.93121693 
0.94708995] mean value: 0.9343915343915343 key: test_fscore value: [0.63157895 0.77777778 0.72727273 0.8 0.88888889 0.90909091 0.73684211 0.90909091 0.72727273 0.85714286] mean value: 0.7964957849168376 key: train_fscore value: [0.93548387 0.91397849 0.91578947 0.93121693 0.93548387 0.93548387 0.95187166 0.94054054 0.92972973 0.94736842] mean value: 0.9336946861504937 key: test_precision value: [0.66666667 0.875 0.66666667 0.8 1. 0.90909091 0.875 0.90909091 0.72727273 0.9 ] mean value: 0.8328787878787879 key: train_precision value: [0.95604396 0.93406593 0.91578947 0.93617021 0.95604396 0.94565217 0.95698925 0.95604396 0.94505495 0.9375 ] mean value: 0.9439353854927787 key: test_recall value: [0.6 0.7 0.8 0.8 0.8 0.90909091 0.63636364 0.90909091 0.72727273 0.81818182] mean value: 0.77 key: train_recall value: [0.91578947 0.89473684 0.91578947 0.92631579 0.91578947 0.92553191 0.94680851 0.92553191 0.91489362 0.95744681] mean value: 0.9238633818589026 key: test_roc_auc value: [0.66363636 0.80454545 0.71818182 0.80909091 0.9 0.90454545 0.76818182 0.90454545 0.71363636 0.85909091] mean value: 0.8045454545454546 key: train_roc_auc value: [0.93661814 0.91545353 0.91534155 0.931243 0.93661814 0.93645017 0.95235162 0.94171333 0.93113102 0.94714446] mean value: 0.9344064949608063 key: test_jcc value: [0.46153846 0.63636364 0.57142857 0.66666667 0.8 0.83333333 0.58333333 0.83333333 0.57142857 0.75 ] mean value: 0.6707425907425908 key: train_jcc value: [0.87878788 0.84158416 0.84466019 0.87128713 0.87878788 0.87878788 0.90816327 0.8877551 0.86868687 0.9 ] mean value: 0.8758500353700914 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [1.30685544 1.22656345 2.2850945 1.44909978 0.99897575 0.93024874 0.82502961 1.04818726 1.14439416 0.94249988]
mean value: 1.2156948566436767
key: score_time
value: [0.05694318 0.05938172 0.03290272 0.0146718 0.01458669 0.01460767 0.01712728 0.01470828 0.01468945 0.01212597]
mean value: 0.025174474716186522
key: test_mcc
value: [0.61818182 0.71562645 0.52295779 1. 0.90829511 0.74795759 0.55161872 1. 0.71818182 0.67419986]
mean value: 0.7457019156935036
key: train_mcc
value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9989473684210526
key: test_accuracy
value: [0.80952381 0.85714286 0.76190476 1. 0.95238095 0.85714286 0.76190476 1. 0.85714286 0.80952381]
mean value: 0.8666666666666667
key: train_accuracy
value: [1. 0.99470899 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9994708994708995
key: test_fscore
value: [0.8 0.84210526 0.73684211 1. 0.94736842 0.84210526 0.73684211 1. 0.85714286 0.77777778]
mean value: 0.8540183792815372
key: train_fscore
value: [1. 0.99470899 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9994708994708995
key: test_precision
value: [0.8 0.88888889 0.77777778 1. 1. 1. 0.875 1. 0.9 1. ]
mean value: 0.9241666666666667
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.8 0.8 0.7 1. 0.9 0.72727273 0.63636364 1. 0.81818182 0.63636364]
mean value: 0.8018181818181819
key: train_recall
value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9989473684210526
key: test_roc_auc
value: [0.80909091 0.85454545 0.75909091 1. 0.95 0.86363636 0.76818182 1. 0.85909091 0.81818182]
mean value: 0.8681818181818182
key: train_roc_auc
value: [1. 0.99473684 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9994736842105263
key: test_jcc
value: [0.66666667 0.72727273 0.58333333 1. 0.9 0.72727273 0.58333333 1. 0.75 0.63636364]
mean value: 0.7574242424242424
key: train_jcc
value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ]
mean value: 0.9989473684210526

MCC on Blind test: 0.6
Accuracy on Blind test: 0.8

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01341915 0.00960445 0.009269 0.00893736 0.00891161 0.0090003 0.00897193 0.00950503 0.00953388 0.01000404] mean value: 0.009715676307678223 key: score_time value: [0.01628661 0.01169658 0.00924897 0.00879931 0.00866485 0.00861454 0.00882459 0.00885248 0.00950265 0.0094347 ] mean value: 0.009992527961730956 key: test_mcc value: [ 0.35527986 -0.23373675 0.11677484 0.23373675 0.39196475 0.50874702 0.42727273 0.33709993 0.15569979 0.80909091] mean value: 0.3101929821275265 key: train_mcc value: [0.4223863 0.42057994 0.42563559 0.42871542 0.43824416 0.39871188 0.49053012 0.44175632 0.4436004 0.39053852] mean value: 0.4300698660601598 key: test_accuracy value: [0.66666667 0.38095238 0.52380952 0.61904762 0.66666667 0.71428571 0.71428571 0.66666667 0.57142857 0.9047619 ] mean value: 0.6428571428571428 key: train_accuracy value: [0.6984127 0.69312169 0.7037037 0.71428571 0.70899471 0.68783069 0.73544974 0.7037037 0.69312169 0.67724868] mean value: 0.7015873015873015 key: test_fscore value: [0.69565217 0.43478261 0.64285714 0.55555556 0.72 0.78571429 0.72727273 0.72 0.68965517 0.90909091] mean value: 0.688058057551311 key: train_fscore value: [0.74439462 0.74561404 0.74311927 0.71276596 0.74885845 0.73059361 0.76635514 0.75 0.75213675 0.73127753] mean value: 0.7425115357581491 key: test_precision value: [0.61538462 0.38461538 0.5 0.625 0.6 0.64705882 0.72727273 0.64285714 0.55555556 0.90909091] mean value: 0.6206835158305747 key: train_precision value: [0.6484375 0.63909774 0.65853659 0.72043011 0.66129032 0.64 0.68333333 0.64615385 0.62857143 0.62406015] mean value: 
0.654991101826883 key: test_recall value: [0.8 0.5 0.9 0.5 0.9 1. 0.72727273 0.81818182 0.90909091 0.90909091] mean value: 0.7963636363636364 key: train_recall value: [0.87368421 0.89473684 0.85263158 0.70526316 0.86315789 0.85106383 0.87234043 0.89361702 0.93617021 0.88297872] mean value: 0.8625643896976484 key: test_roc_auc value: [0.67272727 0.38636364 0.54090909 0.61363636 0.67727273 0.7 0.71363636 0.65909091 0.55454545 0.90454545] mean value: 0.6422727272727272 key: train_roc_auc value: [0.6974804 0.69204927 0.70291153 0.71433371 0.70817469 0.68868981 0.73617021 0.70470325 0.6944009 0.67833147] mean value: 0.7017245240761478 key: test_jcc value: [0.53333333 0.27777778 0.47368421 0.38461538 0.5625 0.64705882 0.57142857 0.5625 0.52631579 0.83333333] mean value: 0.5372547224017812 key: train_jcc value: [0.59285714 0.59440559 0.59124088 0.55371901 0.59854015 0.57553957 0.62121212 0.6 0.60273973 0.57638889] mean value: 0.5906643071898742 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, 
gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.00971127 0.00953364 0.00909424 0.00915623 0.00902915 0.00920463 0.00920725 0.00904489 0.00909853 0.00923586] mean value: 0.0092315673828125 key: score_time value: [0.00908184 0.00872087 0.00866318 0.00864768 0.0086987 0.00893164 0.00869703 0.00877166 0.00874424 0.00874138] mean value: 0.008769822120666505 key: test_mcc value: [ 0.24771685 -0.08528029 0.26967994 0.43007562 0.45226702 0.24771685 0.15894099 0.33028913 0.13762047 0.30914104] mean value: 0.24981676122162266 key: train_mcc value: [0.47383838 0.43945337 0.49511046 0.41111248 0.4606251 0.46213311 0.45044462 0.43913092 0.44972004 0.47093091] mean value: 0.45524993919694523 key: test_accuracy value: [0.61904762 0.47619048 0.61904762 0.71428571 0.71428571 0.61904762 0.57142857 0.66666667 0.57142857 0.61904762] mean value: 0.6190476190476191 key: train_accuracy value: [0.73544974 0.71957672 0.74603175 0.7037037 0.73015873 0.73015873 0.72486772 0.71957672 0.72486772 0.73544974] mean value: 0.726984126984127 key: test_fscore value: [0.63636364 0.26666667 0.66666667 0.66666667 0.625 0.6 0.52631579 0.69565217 0.60869565 0.5 ] mean value: 0.5792027251924277 key: train_fscore value: [0.72222222 0.71657754 0.73333333 0.68539326 0.72727273 0.7150838 0.71428571 0.71657754 0.72340426 0.7311828 ] mean value: 0.7185333185655622 key: test_precision value: [0.58333333 0.4 0.57142857 0.75 0.83333333 0.66666667 0.625 0.66666667 0.58333333 0.8 ] mean value: 0.6479761904761905 key: train_precision value: [0.76470588 0.72826087 0.77647059 0.73493976 0.73913043 0.75294118 0.73863636 0.72043011 0.72340426 0.73913043] mean value: 
0.7418049871707797 key: test_recall value: [0.7 0.2 0.8 0.6 0.5 0.54545455 0.45454545 0.72727273 0.63636364 0.36363636] mean value: 0.5527272727272727 key: train_recall value: [0.68421053 0.70526316 0.69473684 0.64210526 0.71578947 0.68085106 0.69148936 0.71276596 0.72340426 0.72340426] mean value: 0.6974020156774916 key: test_roc_auc value: [0.62272727 0.46363636 0.62727273 0.70909091 0.70454545 0.62272727 0.57727273 0.66363636 0.56818182 0.63181818] mean value: 0.6190909090909091 key: train_roc_auc value: [0.73572228 0.71965286 0.74630459 0.70403135 0.73023516 0.72989922 0.72469205 0.71954087 0.72486002 0.73538634] mean value: 0.7270324748040313 key: test_jcc value: [0.46666667 0.15384615 0.5 0.5 0.45454545 0.42857143 0.35714286 0.53333333 0.4375 0.33333333] mean value: 0.4164939227439227 key: train_jcc value: [0.56521739 0.55833333 0.57894737 0.52136752 0.57142857 0.55652174 0.55555556 0.55833333 0.56666667 0.57627119] mean value: 0.5608642666981495 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.00931072 0.00956655 0.00860357 0.00860739 0.00866055 0.00952816 0.00955248 0.00887895 0.00881219 0.00958371] mean value: 0.009110426902770996 key: score_time value: [0.01484823 0.0155077 0.01463413 0.014539 0.01421666 0.01497364 0.01540303 0.01495123 0.01449418 0.01488686] mean value: 0.014845466613769532 key: test_mcc value: [-0.23636364 0.24120908 -0.14545455 0.42727273 -0.15894099 0.04545455 0.08528029 0.13762047 0.15894099 0.08528029] mean value: 0.06402992102965852 key: train_mcc value: [0.50296855 0.39710991 0.42871542 0.39710991 0.3454297 0.51484568 0.49237699 0.43913092 0.44975918 0.39915366] mean value: 0.43665999514803383 key: test_accuracy value: [0.38095238 0.61904762 0.42857143 0.71428571 0.42857143 0.52380952 0.52380952 0.57142857 0.57142857 0.52380952] mean value: 0.5285714285714286 key: train_accuracy value: [0.75132275 0.6984127 0.71428571 0.6984127 0.67195767 0.75661376 0.74603175 0.71957672 0.72486772 0.6984127 ] mean value: 0.717989417989418 key: test_fscore value: [0.38095238 0.5 0.4 0.7 0.33333333 0.54545455 0.375 0.60869565 0.52631579 0.375 ] mean value: 0.4744751701387857 key: train_fscore value: [0.7486631 0.69518717 0.71276596 0.69518717 0.65934066 0.74444444 0.73913043 0.71657754 0.72043011 0.6779661 ] mean value: 0.710969267849835 key: test_precision value: [0.36363636 0.66666667 0.4 0.7 0.375 0.54545455 0.6 0.58333333 0.625 0.6 ] mean value: 0.5459090909090909 key: train_precision value: [0.76086957 0.70652174 0.72043011 0.70652174 0.68965517 0.77906977 0.75555556 0.72043011 0.72826087 0.72289157] mean value: 0.7290206189773512 key: 
test_recall value: [0.4 0.4 0.4 0.7 0.3 0.54545455 0.27272727 0.63636364 0.45454545 0.27272727] mean value: 0.4381818181818182 key: train_recall value: [0.73684211 0.68421053 0.70526316 0.68421053 0.63157895 0.71276596 0.72340426 0.71276596 0.71276596 0.63829787] mean value: 0.6942105263157895 key: test_roc_auc value: [0.38181818 0.60909091 0.42727273 0.71363636 0.42272727 0.52272727 0.53636364 0.56818182 0.57727273 0.53636364] mean value: 0.5295454545454545 key: train_roc_auc value: [0.75139978 0.69848824 0.71433371 0.69848824 0.67217245 0.75638298 0.74591265 0.71954087 0.72480403 0.6980963 ] mean value: 0.7179619260918253 key: test_jcc value: [0.23529412 0.33333333 0.25 0.53846154 0.2 0.375 0.23076923 0.4375 0.35714286 0.23076923] mean value: 0.31882703081232494 key: train_jcc value: [0.5982906 0.53278689 0.55371901 0.53278689 0.49180328 0.59292035 0.5862069 0.55833333 0.56302521 0.51282051] mean value: 0.5522692962507294 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, 
importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01352286 0.01665878 0.01201367 0.01270628 0.01311684 0.0131042 0.01292396 0.01178217 0.01296258 0.01138806] mean value: 0.013017940521240234 key: score_time value: [0.011729 0.00963879 0.01002765 0.01018095 0.01015759 0.01019979 0.01010227 0.00988412 0.01014733 0.00937819] mean value: 0.010144567489624024 key: test_mcc value: [0.26967994 0.03739788 0.30914104 0.61818182 0.33636364 0.42727273 0.24771685 0.52295779 0.33709993 0.82572282] mean value: 0.3931534435784296 key: train_mcc value: [0.73576888 0.75666293 0.74744848 0.75694773 0.73585755 0.75666293 0.80967855 0.82013664 0.80967855 0.8102023 ] mean value: 0.7739044547190053 key: test_accuracy value: [0.61904762 0.52380952 0.61904762 0.80952381 0.66666667 0.71428571 0.61904762 0.76190476 0.66666667 0.9047619 ] mean value: 0.6904761904761905 key: train_accuracy value: [0.86772487 0.87830688 0.87301587 0.87830688 0.86772487 0.87830688 0.9047619 0.91005291 0.9047619 0.9047619 ] mean value: 0.8867724867724868 key: test_fscore value: [0.66666667 0.44444444 0.69230769 0.8 0.66666667 0.72727273 0.6 0.7826087 0.72 0.9 ] mean value: 0.6999966893010371 key: train_fscore value: [0.87046632 0.87830688 0.87755102 0.88082902 0.86631016 0.87830688 0.90322581 0.90909091 0.90322581 0.90217391] mean value: 0.8869486709274905 key: test_precision value: [0.57142857 0.5 0.5625 0.8 0.63636364 0.72727273 0.66666667 0.75 0.64285714 1. 
] mean value: 0.6857088744588744 key: train_precision value: [0.85714286 0.88297872 0.85148515 0.86734694 0.88043478 0.87368421 0.91304348 0.91397849 0.91304348 0.92222222] mean value: 0.8875360334340103 key: test_recall value: [0.8 0.4 0.9 0.8 0.7 0.72727273 0.54545455 0.81818182 0.81818182 0.81818182] mean value: 0.7327272727272728 key: train_recall value: [0.88421053 0.87368421 0.90526316 0.89473684 0.85263158 0.88297872 0.89361702 0.90425532 0.89361702 0.88297872] mean value: 0.8867973124300113 key: test_roc_auc value: [0.62727273 0.51818182 0.63181818 0.80909091 0.66818182 0.71363636 0.62272727 0.75909091 0.65909091 0.90909091] mean value: 0.6918181818181819 key: train_roc_auc value: [0.86763718 0.87833147 0.87284434 0.87821948 0.86780515 0.87833147 0.90470325 0.9100224 0.90470325 0.90464726] mean value: 0.8867245240761478 key: test_jcc value: [0.5 0.28571429 0.52941176 0.66666667 0.5 0.57142857 0.42857143 0.64285714 0.5625 0.81818182] mean value: 0.5505331678125795 key: train_jcc value: [0.7706422 0.78301887 0.78181818 0.78703704 0.76415094 0.78301887 0.82352941 0.83333333 0.82352941 0.82178218] mean value: 0.7971860435015932 MCC on Blind test: 0.43 Accuracy on Blind test: 0.73 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, 
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [0.80780959 1.64683247 2.04292989 2.00777054 2.20077896 2.11120558 1.75354791 1.55821252 1.74798012 1.62602258] mean value: 1.7503090143203734 key: score_time value: [0.01250052 0.03408003 0.01242089 0.0147202 0.02710557 0.03695774 0.02797961 0.02157736 0.03396821 0.04308009] mean value: 0.0264390230178833 key: test_mcc value: [0.23636364 0.62641448 0.63305416 0.71562645 0.71562645 0.60302269 0.4719399 0.90909091 0.52727273 0.82572282] mean value: 0.6264134232369432 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61904762 0.80952381 0.80952381 0.85714286 0.85714286 0.76190476 0.71428571 0.95238095 0.76190476 0.9047619 ] mean value: 0.8047619047619048 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.6 0.77777778 0.81818182 0.84210526 0.84210526 0.70588235 0.66666667 0.95238095 0.76190476 0.9 ] mean value: 0.7867004856168943 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.6 0.875 0.75 0.88888889 0.88888889 1. 0.85714286 1. 0.8 1. ] mean value: 0.8659920634920635 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.6 0.7 0.9 0.8 0.8 0.54545455 0.54545455 0.90909091 0.72727273 0.81818182] mean value: 0.7345454545454545 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61818182 0.80454545 0.81363636 0.85454545 0.85454545 0.77272727 0.72272727 0.95454545 0.76363636 0.90909091] mean value: 0.8068181818181819 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.42857143 0.63636364 0.69230769 0.72727273 0.72727273 0.54545455 0.5 0.90909091 0.61538462 0.81818182] mean value: 0.65999000999001 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.03607321 0.01854372 0.01873636 0.01854372 0.01878786 0.01843524 0.0188539 0.0193491 0.01886439 0.01850581] mean value: 0.02046933174133301 key: score_time value: [0.0126884 0.0127058 0.01279783 0.0127821 0.01274991 0.01255703 0.01291847 0.01300454 0.01293087 0.01302671] mean value: 0.012816166877746582 key: test_mcc value: [0.82275335 0.82275335 0.71818182 1. 1. 0.90909091 0.52727273 1. 
0.82572282 0.90909091] mean value: 0.8534865889896018 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.9047619 0.85714286 1. 1. 0.95238095 0.76190476 1. 0.9047619 0.95238095] mean value: 0.9238095238095237 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.88888889 0.85714286 1. 1. 0.95238095 0.76190476 1. 0.9 0.95238095] mean value: 0.9201587301587302 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.81818182 1. 1. 1. 0.8 1. 1. 1. ] mean value: 0.9618181818181818 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.8 0.9 1. 1. 0.90909091 0.72727273 1. 0.81818182 0.90909091] mean value: 0.8863636363636364 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.9 0.85909091 1. 1. 0.95454545 0.76363636 1. 0.90909091 0.95454545] mean value: 0.9240909090909091 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.8 0.75 1. 1. 0.90909091 0.61538462 1. 0.81818182 0.90909091] mean value: 0.8601748251748251 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.49 Accuracy on Blind test: 0.73 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.14116931 0.14187193 0.14279318 0.14259577 0.14345264 0.14271784 0.14100552 0.1444006 0.28073263 0.1378026 ] mean value: 0.1558542013168335 key: score_time value: [0.02478743 0.02508497 0.02502012 0.02513218 0.02525902 0.02499199 0.02501416 0.02670145 0.0441525 0.02399564] mean value: 0.027013945579528808 key: test_mcc value: [0.33636364 0.33028913 0.63305416 0.71562645 1. 0.90909091 0.55161872 0.90909091 0.90829511 0.90909091] mean value: 0.7202519935788301 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.66666667 0.66666667 0.80952381 0.85714286 1. 0.95238095 0.76190476 0.95238095 0.95238095 0.95238095] mean value: 0.8571428571428571 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.63157895 0.81818182 0.84210526 1. 0.95238095 0.73684211 0.95238095 0.95652174 0.95238095] mean value: 0.850903939691125 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.63636364 0.66666667 0.75 0.88888889 1. 1. 0.875 1. 
0.91666667 1. ] mean value: 0.8733585858585858 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.7 0.6 0.9 0.8 1. 0.90909091 0.63636364 0.90909091 1. 0.90909091] mean value: 0.8363636363636363 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.66818182 0.66363636 0.81363636 0.85454545 1. 0.95454545 0.76818182 0.95454545 0.95 0.95454545] mean value: 0.8581818181818182 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.46153846 0.69230769 0.72727273 1. 0.90909091 0.58333333 0.90909091 0.91666667 0.90909091] mean value: 0.7608391608391608 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, 
monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01380777 0.01287603 0.01335287 0.01323295 0.01333666 0.0131979 0.01330137 0.01331997 0.01724935 0.01322055] mean value: 0.013689541816711425 key: score_time value: [0.01226544 0.01245737 0.01221156 0.01228142 0.01906157 0.01224637 0.01229739 0.01220536 0.02101064 0.01225805] mean value: 0.013829517364501952 key: test_mcc value: [0.23636364 0.33636364 0.33028913 0.62641448 0.62641448 0.71818182 0.35527986 0.82572282 0.63305416 0.52727273] mean value: 0.52153567593267 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61904762 0.66666667 0.66666667 0.80952381 0.80952381 0.85714286 0.66666667 0.9047619 0.80952381 0.76190476] mean value: 0.7571428571428571 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.6 0.66666667 0.63157895 0.77777778 0.77777778 0.85714286 0.63157895 0.9 0.8 0.76190476] mean value: 0.7404427736006683 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.6 0.63636364 0.66666667 0.875 0.875 0.9 0.75 1. 0.88888889 0.8 ] mean value: 0.7991919191919192 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.6 0.7 0.6 0.7 0.7 0.81818182 0.54545455 0.81818182 0.72727273 0.72727273] mean value: 0.6936363636363636 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.61818182 0.66818182 0.66363636 0.80454545 0.80454545 0.85909091 0.67272727 0.90909091 0.81363636 0.76363636] mean value: 0.7577272727272728 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.42857143 0.5 0.46153846 0.63636364 0.63636364 0.75 0.46153846 0.81818182 0.66666667 0.61538462] mean value: 0.5974608724608724 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), 
('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.89492655 1.78066468 1.30577731 1.62279463 1.80648351 2.03026009 1.99937129 1.9573288 1.32665277 1.25374627] mean value: 1.697800588607788 key: score_time value: [0.13693285 0.12314796 0.09956264 0.12987232 0.12343669 0.12388182 0.15492225 0.12438631 0.09179854 0.09004569] mean value: 0.11979870796203614 key: test_mcc value: [0.23636364 0.58630197 0.63305416 0.80909091 1. 1. 0.55161872 0.90909091 0.90909091 0.90909091] mean value: 0.7543702131757848 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61904762 0.76190476 0.80952381 0.9047619 1. 1. 
0.76190476 0.95238095 0.95238095 0.95238095] mean value: 0.8714285714285714 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.6 0.66666667 0.81818182 0.9 1. 1. 0.73684211 0.95238095 0.95238095 0.95238095] mean value: 0.8578833447254499 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.6 1. 0.75 0.9 1. 1. 0.875 1. 1. 1. ] mean value: 0.9125 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.6 0.5 0.9 0.9 1. 1. 0.63636364 0.90909091 0.90909091 0.90909091] mean value: 0.8263636363636364 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61818182 0.75 0.81363636 0.90454545 1. 1. 0.76818182 0.95454545 0.95454545 0.95454545] mean value: 0.8718181818181818 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn(
key: test_jcc value: [0.42857143 0.5 0.69230769 0.81818182 1. 1. 0.58333333 0.90909091 0.90909091 0.90909091] mean value: 0.7749666999667 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.88374734 0.898525 0.88748789 0.90992236 0.87181115 0.90699387 0.91898417 0.8755424 0.88821411 0.94180608] mean value: 0.8983034372329712 key: score_time value: [0.13070011 0.15602064 0.2069819 0.17687368 0.21706963 0.24573994 0.12457681 0.16810298 0.19927859 0.16002679] mean value: 0.17853710651397706 key: test_mcc value: [0.52295779 0.24120908 0.4719399 0.62641448 0.90829511 0.90829511 0.63305416 0.90909091 0.71562645 0.82572282] mean value: 0.6762605808801134 key: train_mcc value: [0.95767077 0.95767077 0.95788064 0.95767077 0.96830553 0.95789003 0.98947368 0.96830907 0.96830907 0.96830907] mean value: 0.9651489409652466 key: test_accuracy value: [0.76190476 0.61904762 0.71428571 0.80952381 0.95238095 0.95238095 0.80952381 0.95238095 0.85714286 0.9047619 ] mean value: 0.8333333333333333 key: train_accuracy value: [0.97883598 0.97883598 0.97883598 0.97883598 0.98412698 0.97883598 0.99470899 0.98412698 0.98412698 0.98412698] mean 
value: 0.9825396825396825 key: test_fscore value: [0.73684211 0.5 0.75 0.77777778 0.94736842 0.95652174 0.8 0.95238095 0.86956522 0.9 ] mean value: 0.8190456212996259 key: train_fscore value: [0.97894737 0.97894737 0.97916667 0.97894737 0.98429319 0.97894737 0.99470899 0.98412698 0.98412698 0.98412698] mean value: 0.9826339281158102 key: test_precision value: [0.77777778 0.66666667 0.64285714 0.875 1. 0.91666667 0.88888889 1. 0.83333333 1. ] mean value: 0.8601190476190477 key: train_precision value: [0.97894737 0.97894737 0.96907216 0.97894737 0.97916667 0.96875 0.98947368 0.97894737 0.97894737 0.97894737] mean value: 0.9780146726351963 key: test_recall value: [0.7 0.4 0.9 0.7 0.9 1. 0.72727273 0.90909091 0.90909091 0.81818182] mean value: 0.7963636363636364 key: train_recall value: [0.97894737 0.97894737 0.98947368 0.97894737 0.98947368 0.9893617 1. 0.9893617 0.9893617 0.9893617 ] mean value: 0.9873236282194849 key: test_roc_auc value: [0.75909091 0.60909091 0.72272727 0.80454545 0.95 0.95 0.81363636 0.95454545 0.85454545 0.90909091] mean value: 0.8327272727272728 key: train_roc_auc value: [0.97883539 0.97883539 0.9787794 0.97883539 0.98409854 0.97889138 0.99473684 0.98415454 0.98415454 0.98415454] mean value: 0.9825475923852184 key: test_jcc value: [0.58333333 0.33333333 0.6 0.63636364 0.9 0.91666667 0.66666667 0.90909091 0.76923077 0.81818182] mean value: 0.7132867132867133 key: train_jcc value: [0.95876289 0.95876289 0.95918367 0.95876289 0.96907216 0.95876289 0.98947368 0.96875 0.96875 0.96875 ] mean value: 0.9659031069020121 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, 
random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02410841 0.00991416 0.01022744 0.01032209 0.00959754 0.01036572 0.00976348 0.00933981 0.01035428 0.01001096] mean value: 0.011400389671325683 key: score_time value: [0.00971651 0.00905752 0.00934982 0.00952864 0.00984573 0.00897074 0.00895166 0.00975013 0.00975347 0.00914025] mean value: 0.009406447410583496 key: test_mcc value: [ 0.24771685 -0.08528029 0.26967994 0.43007562 0.45226702 0.24771685 0.15894099 0.33028913 0.13762047 0.30914104] mean value: 0.24981676122162266 key: train_mcc value: [0.47383838 0.43945337 0.49511046 0.41111248 0.4606251 0.46213311 0.45044462 0.43913092 0.44972004 0.47093091] mean value: 0.45524993919694523 key: test_accuracy value: [0.61904762 0.47619048 0.61904762 0.71428571 0.71428571 0.61904762 0.57142857 0.66666667 0.57142857 0.61904762] mean value: 0.6190476190476191 key: train_accuracy value: [0.73544974 0.71957672 0.74603175 0.7037037 0.73015873 0.73015873 0.72486772 0.71957672 0.72486772 0.73544974] mean value: 0.726984126984127 key: test_fscore value: [0.63636364 0.26666667 0.66666667 0.66666667 0.625 0.6 0.52631579 0.69565217 0.60869565 0.5 ] mean value: 0.5792027251924277 key: train_fscore value: [0.72222222 0.71657754 0.73333333 0.68539326 0.72727273 0.7150838 0.71428571 0.71657754 0.72340426 0.7311828 ] mean value: 0.7185333185655622 key: test_precision value: [0.58333333 0.4 0.57142857 0.75 0.83333333 0.66666667 0.625 0.66666667 0.58333333 0.8 ] mean value: 0.6479761904761905 key: train_precision value: [0.76470588 0.72826087 0.77647059 0.73493976 0.73913043 0.75294118 0.73863636 0.72043011 0.72340426 0.73913043] mean value: 
0.7418049871707797 key: test_recall value: [0.7 0.2 0.8 0.6 0.5 0.54545455 0.45454545 0.72727273 0.63636364 0.36363636] mean value: 0.5527272727272727 key: train_recall value: [0.68421053 0.70526316 0.69473684 0.64210526 0.71578947 0.68085106 0.69148936 0.71276596 0.72340426 0.72340426] mean value: 0.6974020156774916 key: test_roc_auc value: [0.62272727 0.46363636 0.62727273 0.70909091 0.70454545 0.62272727 0.57727273 0.66363636 0.56818182 0.63181818] mean value: 0.6190909090909091 key: train_roc_auc value: [0.73572228 0.71965286 0.74630459 0.70403135 0.73023516 0.72989922 0.72469205 0.71954087 0.72486002 0.73538634] mean value: 0.7270324748040313 key: test_jcc value: [0.46666667 0.15384615 0.5 0.5 0.45454545 0.42857143 0.35714286 0.53333333 0.4375 0.33333333] mean value: 0.4164939227439227 key: train_jcc value: [0.56521739 0.55833333 0.57894737 0.52136752 0.57142857 0.55652174 0.55555556 0.55833333 0.56666667 0.57627119] mean value: 0.5608642666981495 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [1.47359419 1.51395917 1.40394354 1.49806833 0.73248506 0.14855814 0.12012458 1.29104447 0.43357348 0.96943855] mean value: 0.9584789514541626 key: score_time value: [0.01303411 0.01369548 0.01240373 0.02015805 0.01262212 0.01179838 0.01309061 0.01261091 0.01320601 0.01363063] mean value: 0.013625001907348633 key: test_mcc value: [0.82275335 0.90829511 0.63305416 1. 1. 1. 0.71562645 1. 1. 0.82572282] mean value: 0.8905451893561251 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.95238095 0.80952381 1. 1. 1. 0.85714286 1. 1. 0.9047619 ] mean value: 0.9428571428571428 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.94736842 0.81818182 1. 1. 1. 0.86956522 1. 1. 0.9 ] mean value: 0.9424004345514643 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.75 1. 1. 1. 0.83333333 1. 1. 1. ] mean value: 0.9583333333333334 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.9 0.9 1. 1. 1. 0.90909091 1. 1. 0.81818182] mean value: 0.9327272727272727 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.95 0.81363636 1. 1. 1. 0.85454545 1. 1. 0.90909091] mean value: 0.9427272727272727 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.8 0.9 0.69230769 1. 1. 1. 0.76923077 1. 1. 0.81818182] mean value: 0.897972027972028 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, 
random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.04121399 0.07725215 0.07548904 0.03828049 0.06962919 0.06137753 0.04697084 0.05984068 0.04247427 0.06800699] mean value: 0.05805351734161377 key: score_time value: [0.01933718 0.0345602 0.01221371 0.01211905 0.0219357 0.01210523 0.02289724 0.01555729 0.02384901 0.02143526] mean value: 0.019600987434387207 key: test_mcc value: [0.53935989 0.44038551 0.53935989 0.80909091 0.74161985 0.82572282 0.67419986 0.90909091 0.82572282 0.53300179] mean value: 0.6837554253924925 key: train_mcc value: [0.97883539 0.94757483 0.98947251 0.97905701 0.97905701 0.96830553 0.95788064 0.95788064 0.92637852 0.95788064] mean value: 0.9642322713040267 key: test_accuracy value: [0.76190476 0.71428571 0.76190476 0.9047619 0.85714286 0.9047619 0.80952381 0.95238095 0.9047619 0.71428571] mean value: 0.8285714285714285 key: train_accuracy value: [0.98941799 0.97354497 0.99470899 0.98941799 0.98941799 0.98412698 0.97883598 0.97883598 0.96296296 0.97883598] mean value: 0.982010582010582 key: 
test_fscore value: [0.70588235 0.72727273 0.70588235 0.9 0.82352941 0.9 0.77777778 0.95238095 0.9 0.625 ] mean value: 0.8017725575078516 key: train_fscore value: [0.98947368 0.97326203 0.9947644 0.9893617 0.9893617 0.98395722 0.97849462 0.97849462 0.96216216 0.97849462] mean value: 0.9817826770838407 key: test_precision value: [0.85714286 0.66666667 0.85714286 0.9 1. 1. 1. 1. 1. 1. ] mean value: 0.9280952380952381 key: train_precision value: [0.98947368 0.98913043 0.98958333 1. 1. 0.98924731 0.98913043 0.98913043 0.97802198 0.98913043] mean value: 0.990284804652423 key: test_recall value: [0.6 0.8 0.6 0.9 0.7 0.81818182 0.63636364 0.90909091 0.81818182 0.45454545] mean value: 0.7236363636363636 key: train_recall value: [0.98947368 0.95789474 1. 0.97894737 0.97894737 0.9787234 0.96808511 0.96808511 0.94680851 0.96808511] mean value: 0.973505039193729 key: test_roc_auc value: [0.75454545 0.71818182 0.75454545 0.90454545 0.85 0.90909091 0.81818182 0.95454545 0.90909091 0.72727273] mean value: 0.83 key: train_roc_auc value: [0.98941769 0.97362822 0.99468085 0.98947368 0.98947368 0.98409854 0.9787794 0.9787794 0.96287794 0.9787794 ] mean value: 0.9819988801791713 key: test_jcc value: [0.54545455 0.57142857 0.54545455 0.81818182 0.7 0.81818182 0.63636364 0.90909091 0.81818182 0.45454545] mean value: 0.6816883116883117 key: train_jcc value: [0.97916667 0.94791667 0.98958333 0.97894737 0.97894737 0.96842105 0.95789474 0.95789474 0.92708333 0.95789474] mean value: 0.964375 MCC on Blind test: 0.11 Accuracy on Blind test: 0.53 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), 
('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02042866 0.01330423 0.01539516 0.01460958 0.01472521 0.0147078 0.01172614 0.01205611 0.00887656 0.00874543] mean value: 0.013457489013671876 key: score_time value: [0.02030158 0.01348376 0.01388836 0.01446819 0.01431298 0.01398635 0.01170921 0.00890064 0.00833917 0.00834632] mean value: 0.012773656845092773 key: test_mcc value: [ 0.33028913 -0.05504819 0.08528029 0.42727273 0.55161872 0.62641448 0.23636364 0.24120908 0.02312486 0.55161872] mean value: 0.30181434631411985 key: train_mcc value: [0.39005594 0.40281841 0.35974476 0.37994444 0.43065616 0.40240809 0.44248737 0.47825095 0.44248737 0.39243141] mean value: 0.41212849003171664 key: test_accuracy value: [0.66666667 0.47619048 0.52380952 0.71428571 0.76190476 0.80952381 0.61904762 0.61904762 0.52380952 0.76190476] mean value: 0.6476190476190476 key: train_accuracy value: [0.69312169 0.6984127 0.67724868 0.68783069 0.71428571 0.6984127 0.71957672 0.73544974 0.71957672 0.69312169] mean value: 0.7037037037037037 key: test_fscore value: [0.63157895 0.42105263 0.61538462 0.7 0.7826087 0.83333333 0.63636364 0.69230769 0.64285714 0.73684211] mean value: 0.669232880010912 key: train_fscore value: [0.71568627 0.72463768 0.70531401 0.71219512 0.73 0.71921182 0.73366834 0.75490196 0.73366834 0.71568627] mean value: 0.7244969828653581 key: test_precision value: [0.66666667 0.44444444 0.5 0.7 0.69230769 0.76923077 0.63636364 0.6 0.52941176 0.875 ] mean value: 0.6413424973719091 key: train_precision value: [0.66972477 0.66964286 0.65178571 0.66363636 0.6952381 0.66972477 0.6952381 0.7 0.6952381 0.66363636] mean value: 
0.6773865125699988 key: test_recall value: [0.6 0.4 0.8 0.7 0.9 0.90909091 0.63636364 0.81818182 0.81818182 0.63636364] mean value: 0.7218181818181818 key: train_recall value: [0.76842105 0.78947368 0.76842105 0.76842105 0.76842105 0.77659574 0.77659574 0.81914894 0.77659574 0.77659574] mean value: 0.7788689809630459 key: test_roc_auc value: [0.66363636 0.47272727 0.53636364 0.71363636 0.76818182 0.80454545 0.61818182 0.60909091 0.50909091 0.76818182] mean value: 0.6463636363636364 key: train_roc_auc value: [0.69272116 0.69792833 0.67676372 0.68740202 0.71399776 0.69882419 0.71987682 0.73589026 0.71987682 0.69356103] mean value: 0.7036842105263158 key: test_jcc value: [0.46153846 0.26666667 0.44444444 0.53846154 0.64285714 0.71428571 0.46666667 0.52941176 0.47368421 0.58333333] mean value: 0.5121349943486166 key: train_jcc value: [0.55725191 0.56818182 0.54477612 0.5530303 0.57480315 0.56153846 0.57936508 0.60629921 0.57936508 0.55725191] mean value: 0.5681863039882344 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, 
colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01249909 0.01643276 0.01513815 0.01626801 0.01667428 0.01826024 0.01673746 0.01813006 0.0181365 0.01722217] mean value: 0.01654987335205078 key: score_time value: [0.00913095 0.01083755 0.01164222 0.01167583 0.01165843 0.01161766 0.01240993 0.01225019 0.01175737 0.01168919] mean value: 0.01146693229675293 key: test_mcc value: [0.30914104 0.45226702 0.71562645 0.67419986 0.58630197 0.60302269 0.55161872 0.82275335 0.74795759 0.67419986] mean value: 0.6137088554293553 key: train_mcc value: [0.80642655 0.76291765 0.83076702 0.58655527 0.76291765 0.82445214 0.94713854 0.93736014 0.91553719 0.93837953] mean value: 0.8312451676284541 key: test_accuracy value: [0.61904762 0.71428571 0.85714286 0.80952381 0.76190476 0.76190476 0.76190476 0.9047619 0.85714286 0.80952381] mean value: 0.7857142857142857 key: train_accuracy value: [0.89417989 0.86772487 0.91005291 0.75661376 0.86772487 0.9047619 0.97354497 0.96825397 0.95767196 0.96825397] mean value: 0.9068783068783068 key: test_fscore value: [0.69230769 0.625 0.84210526 0.83333333 0.66666667 0.70588235 0.73684211 0.91666667 0.84210526 0.77777778] mean value: 0.7638687121272261 key: train_fscore value: [0.9047619 0.84848485 0.90285714 0.80508475 0.84848485 0.89411765 0.97326203 0.96875 0.95698925 0.96703297] mean value: 0.9069825383840636 key: test_precision value: [0.5625 0.83333333 0.88888889 0.71428571 1. 1. 0.875 0.84615385 1. 1. ] mean value: 0.8720161782661783 key: train_precision value: [0.82608696 1. 0.9875 0.67375887 1. 1. 0.97849462 0.94897959 0.9673913 1. 
] mean value: 0.9382211341610441 key: test_recall value: [0.9 0.5 0.8 1. 0.5 0.54545455 0.63636364 1. 0.72727273 0.63636364] mean value: 0.7245454545454546 key: train_recall value: [1. 0.73684211 0.83157895 1. 0.73684211 0.80851064 0.96808511 0.9893617 0.94680851 0.93617021] mean value: 0.8954199328107503 key: test_roc_auc value: [0.63181818 0.70454545 0.85454545 0.81818182 0.75 0.77272727 0.76818182 0.9 0.86363636 0.81818182] mean value: 0.7881818181818182 key: train_roc_auc value: [0.89361702 0.86842105 0.91047032 0.75531915 0.86842105 0.90425532 0.97351624 0.96836506 0.95761478 0.96808511] mean value: 0.9068085106382979 key: test_jcc value: [0.52941176 0.45454545 0.72727273 0.71428571 0.5 0.54545455 0.58333333 0.84615385 0.72727273 0.63636364] mean value: 0.6264093749387867 key: train_jcc value: [0.82608696 0.73684211 0.82291667 0.67375887 0.73684211 0.80851064 0.94791667 0.93939394 0.91752577 0.93617021] mean value: 0.834596392928326 MCC on Blind test: 0.49 Accuracy on Blind test: 0.73 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01460838 0.01574397 0.01721883 0.01524639 0.03197598 0.01554775 0.01560473 0.01410699 0.01534986 0.01558042] mean value: 0.017098331451416017 key: score_time value: [0.01211929 0.01337838 0.01466918 0.01597857 0.02657938 0.01167512 0.01169777 0.0117774 0.01175189 0.01166606] mean value: 0.014129304885864257 key: test_mcc value: [0.26967994 0.53300179 0.62641448 0.58630197 0.90829511 0.38924947 0.55161872 0.50874702 0.80909091 0.62641448] mean value: 0.580881390303866 key: train_mcc value: [0.82785245 0.48764459 0.69501809 0.53983361 0.93841972 0.41041408 0.87061974 0.48948681 0.79048128 0.63728115] mean value: 0.668705151901458 key: test_accuracy value: [0.61904762 0.71428571 0.80952381 0.76190476 0.95238095 0.61904762 0.76190476 0.71428571 0.9047619 0.80952381] mean value: 0.7666666666666666 key: train_accuracy value: [0.91005291 0.69312169 0.82539683 0.72486772 0.96825397 0.64550265 0.93121693 0.6984127 0.88888889 0.78835979] mean value: 0.8074074074074074 key: test_fscore value: [0.66666667 0.76923077 0.77777778 0.66666667 0.94736842 0.42857143 0.73684211 0.78571429 0.90909091 0.83333333] mean value: 0.7521262363367627 key: train_fscore value: [0.91625616 0.76612903 0.78980892 0.62318841 0.9673913 0.44628099 0.92571429 0.7654321 0.87719298 0.8245614 ] mean value: 0.790195557941608 key: test_precision value: [0.57142857 0.625 0.875 1. 1. 1. 0.875 0.64705882 0.90909091 0.76923077] mean value: 0.8271809073279661 key: train_precision value: [0.86111111 0.62091503 1. 1. 1. 1. 1. 
0.62416107 0.97402597 0.70149254] mean value: 0.878170572895576 key: test_recall value: [0.8 1. 0.7 0.5 0.9 0.27272727 0.63636364 1. 0.90909091 0.90909091] mean value: 0.7627272727272727 key: train_recall value: [0.97894737 1. 0.65263158 0.45263158 0.93684211 0.28723404 0.86170213 0.9893617 0.79787234 1. ] mean value: 0.7957222844344904 key: test_roc_auc value: [0.62727273 0.72727273 0.80454545 0.75 0.95 0.63636364 0.76818182 0.7 0.90454545 0.80454545] mean value: 0.7672727272727273 key: train_roc_auc value: [0.90968645 0.69148936 0.82631579 0.72631579 0.96842105 0.64361702 0.93085106 0.69994401 0.88840985 0.78947368] mean value: 0.8074524076147817 key: test_jcc value: [0.5 0.625 0.63636364 0.5 0.9 0.27272727 0.58333333 0.64705882 0.83333333 0.71428571] mean value: 0.6212102113572702 key: train_jcc value: [0.84545455 0.62091503 0.65263158 0.45263158 0.93684211 0.28723404 0.86170213 0.62 0.78125 0.70149254] mean value: 0.6760153548818377 MCC on Blind test: 0.49 Accuracy on Blind test: 0.73 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.12339163 0.15838337 0.13479781 0.10761476 0.11238742 0.11786103 0.11912537 0.12101555 0.1190908 0.16042662] mean value: 0.12740943431854249 key: score_time value: [0.01511359 0.02331877 0.01539397 0.01508164 0.01590371 0.01642942 0.01661968 0.01648045 0.01624656 0.02382159] mean value: 0.017440938949584962 key: test_mcc value: [0.80909091 0.82275335 0.52727273 0.90829511 1. 0.90909091 0.71818182 0.82572282 1. 0.90909091] mean value: 0.8429498554008733 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.9047619 0.76190476 0.95238095 1. 0.95238095 0.85714286 0.9047619 1. 0.95238095] mean value: 0.919047619047619 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.9 0.88888889 0.76190476 0.94736842 1. 0.95238095 0.85714286 0.9 1. 0.95238095] mean value: 0.9160066833751045 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.9 1. 0.72727273 1. 1. 1. 0.9 1. 1. 1. ] mean value: 0.9527272727272728 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 0.8 0.8 0.9 1. 0.90909091 0.81818182 0.81818182 1. 0.90909091] mean value: 0.8854545454545455 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.90454545 0.9 0.76363636 0.95 1. 0.95454545 0.85909091 0.90909091 1. 0.95454545] mean value: 0.9195454545454546 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.81818182 0.8 0.61538462 0.9 1. 0.90909091 0.75 0.81818182 1. 0.90909091] mean value: 0.8519930069930071 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), 
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.03891611 0.03142166 0.04908037 0.03901887 0.03456759 0.02894402 0.03260779 0.03758121 0.04022551 0.04435349] mean value: 0.037671661376953124 key: score_time value: [0.01717496 0.01689577 0.02591538 0.02410626 0.01824427 0.02229643 0.02497411 0.02035999 0.02795792 0.01740479] mean value: 0.021532988548278807 key: test_mcc value: [0.74161985 0.90829511 0.71818182 0.82275335 0.90829511 1. 0.80909091 1. 1. 0.90909091] mean value: 0.881732704873914 key: train_mcc value: [0.97905701 0.97883539 0.98947368 0.98947368 1. 0.98947251 1. 0.97905237 1. 0.98947251] mean value: 0.9894837157705416 key: test_accuracy value: [0.85714286 0.95238095 0.85714286 0.9047619 0.95238095 1. 0.9047619 1. 1. 0.95238095] mean value: 0.9380952380952381 key: train_accuracy value: [0.98941799 0.98941799 0.99470899 0.99470899 1. 0.99470899 1. 0.98941799 1. 
0.99470899] mean value: 0.9947089947089947 key: test_fscore value: [0.82352941 0.94736842 0.85714286 0.88888889 0.94736842 1. 0.90909091 1. 1. 0.95238095] mean value: 0.9325769861373576 key: train_fscore value: [0.9893617 0.98947368 0.99470899 0.99470899 1. 0.99465241 1. 0.98924731 1. 0.99465241] mean value: 0.9946805500418356 key: test_precision value: [1. 1. 0.81818182 1. 1. 1. 0.90909091 1. 1. 1. ] mean value: 0.9727272727272728 key: train_precision value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9989473684210526 key: test_recall value: [0.7 0.9 0.9 0.8 0.9 1. 0.90909091 1. 1. 0.90909091] mean value: 0.9018181818181819 key: train_recall value: [0.97894737 0.98947368 0.98947368 0.98947368 1. 0.9893617 1. 0.9787234 1. 0.9893617 ] mean value: 0.990481522956327 key: test_roc_auc value: [0.85 0.95 0.85909091 0.9 0.95 1. 0.90454545 1. 1. 0.95454545] mean value: 0.9368181818181818 key: train_roc_auc value: [0.98947368 0.98941769 0.99473684 0.99473684 1. 0.99468085 1. 0.9893617 1. 0.99468085] mean value: 0.9947088465845465 key: test_jcc value: [0.7 0.9 0.75 0.8 0.9 1. 0.83333333 1. 1. 0.90909091] mean value: 0.8792424242424243 key: train_jcc value: [0.97894737 0.97916667 0.98947368 0.98947368 1. 0.9893617 1. 0.9787234 1. 
0.9893617 ] mean value: 0.989450821201941 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.06760859 0.46913624 0.26768494 0.08426118 0.09855819 0.09934545 0.07997847 0.09252858 0.13018012 0.09178209] mean value: 0.14810638427734374 key: score_time value: [0.02587819 0.01851845 0.02169013 0.02464533 0.02099586 0.02412891 0.02215743 0.02377272 0.03352404 0.0315032 ] mean value: 0.02468142509460449 key: test_mcc value: [0.13762047 0.62641448 0.52727273 0.52295779 0.82275335 0.52727273 0.39196475 0.82572282 0.4719399 0.44038551] mean value: 0.5294304529705056 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.57142857 0.80952381 0.76190476 0.76190476 0.9047619 0.76190476 0.66666667 0.9047619 0.71428571 0.71428571] mean value: 0.7571428571428571 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.52631579 0.77777778 0.76190476 0.73684211 0.88888889 0.76190476 0.58823529 0.9 0.66666667 0.7 ] mean value: 0.7308536045997346 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.55555556 0.875 0.72727273 0.77777778 1. 0.8 0.83333333 1. 0.85714286 0.77777778] mean value: 0.8203860028860029 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.5 0.7 0.8 0.7 0.8 0.72727273 0.45454545 0.81818182 0.54545455 0.63636364] mean value: 0.6681818181818182 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.56818182 0.80454545 0.76363636 0.75909091 0.9 0.76363636 0.67727273 0.90909091 0.72272727 0.71818182] mean value: 0.7586363636363637 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.35714286 0.63636364 0.61538462 0.58333333 0.8 0.61538462 0.41666667 0.81818182 0.5 0.53846154] mean value: 0.5880919080919081 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.35438156 0.39279032 0.34204507 0.33374977 0.37280583 0.38470411 0.33873773 0.35585856 0.37735271 0.37621069] mean value: 0.3628636360168457 key: score_time value: [0.01215506 0.01001287 0.00964713 0.00953436 0.01329851 0.00969267 0.0101738 0.01015115 0.01490641 0.00933409] mean value: 0.010890603065490723 key: test_mcc value: [0.82275335 0.90829511 0.82572282 1. 1. 1. 0.80909091 1. 1. 0.90909091] mean value: 0.9274953099463279 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.95238095 0.9047619 1. 1. 1. 0.9047619 1. 1. 0.95238095] mean value: 0.9619047619047619 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.94736842 0.90909091 1. 1. 1. 0.90909091 1. 1. 0.95238095] mean value: 0.9606820080504291 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 0.83333333 1. 1. 1. 0.90909091 1. 1. 1. ] mean value: 0.9742424242424242 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.9 1. 1. 1. 1. 0.90909091 1. 1. 0.90909091] mean value: 0.9518181818181818 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.95 0.90909091 1. 1. 1. 0.90454545 1. 1. 0.95454545] mean value: 0.9618181818181818 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.9 0.83333333 1. 1. 1. 0.83333333 1. 1. 
0.90909091] mean value: 0.9275757575757576 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), 
('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.05870724 0.03189158 0.03561378 0.02327633 0.02330828 0.02847648 0.0224936 0.02254534 0.03917503 0.02338982] mean value: 0.030887746810913087 key: score_time value: [0.01973987 0.01692939 0.01304388 0.01248932 0.01487756 0.01249003 0.01517224 0.01476288 0.0195353 0.01688313] mean value: 0.015592360496520996 key: test_mcc value: [0.38924947 0.46249729 0.60302269 0.53300179 0.82572282 0.74161985 0.74161985 0.90829511 0.82275335 0.74161985] mean value: 0.6769402069598355 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61904762 0.66666667 0.76190476 0.71428571 0.9047619 0.85714286 0.85714286 0.95238095 0.9047619 0.85714286] mean value: 0.8095238095238095 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.71428571 0.74074074 0.8 0.76923077 0.90909091 0.88 0.88 0.95652174 0.91666667 0.88 ] mean value: 0.8446536539145235 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.55555556 0.58823529 0.66666667 0.625 0.83333333 0.78571429 0.78571429 0.91666667 0.84615385 0.78571429] mean value: 0.7388754219636573 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.63636364 0.68181818 0.77272727 0.72727273 0.90909091 0.85 0.85 0.95 0.9 0.85 ] mean value: 0.8127272727272727 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.55555556 0.58823529 0.66666667 0.625 0.83333333 0.78571429 0.78571429 0.91666667 0.84615385 0.78571429] mean value: 0.7388754219636573 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.0 Accuracy on Blind test: 0.6 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, 
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.05089164 0.02581358 0.02874446 0.03665304 0.03879476 0.03609467 0.0341928 0.01681209 0.01631188 0.01475215] mean value: 0.029906105995178223 key: score_time value: [0.03134346 0.01229501 0.02358603 0.0211916 0.02092743 0.0194459 0.01217341 0.0122261 0.01211548 0.01222587] mean value: 0.017753028869628908 key: test_mcc value: [0.71562645 0.71562645 0.33636364 1. 0.80909091 0.63305416 0.55161872 0.90829511 0.82572282 0.71818182] mean value: 0.7213580077427297 key: train_mcc value: [0.96830907 0.93736014 0.93670891 0.9264031 0.92597156 0.94714446 0.95767077 0.95789003 0.94713854 0.93672304] mean value: 0.9441319626080061 key: test_accuracy value: [0.85714286 0.85714286 0.66666667 1. 0.9047619 0.80952381 0.76190476 0.95238095 0.9047619 0.85714286] mean value: 0.8571428571428571 key: train_accuracy value: [0.98412698 0.96825397 0.96825397 0.96296296 0.96296296 0.97354497 0.97883598 0.97883598 0.97354497 0.96825397] mean value: 0.9719576719576719 key: test_fscore value: [0.84210526 0.84210526 0.66666667 1. 0.9 0.8 0.73684211 0.95652174 0.9 0.85714286] mean value: 0.8501383894518906 key: train_fscore value: [0.98412698 0.96774194 0.96875 0.96256684 0.96335079 0.97354497 0.9787234 0.97894737 0.97326203 0.96842105] mean value: 0.9719435380809441 key: test_precision value: [0.88888889 0.88888889 0.63636364 1. 0.9 0.88888889 0.875 0.91666667 1. 
0.9 ] mean value: 0.8894696969696969 key: train_precision value: [0.9893617 0.98901099 0.95876289 0.97826087 0.95833333 0.96842105 0.9787234 0.96875 0.97849462 0.95833333] mean value: 0.9726452194511284 key: test_recall value: [0.8 0.8 0.7 1. 0.9 0.72727273 0.63636364 1. 0.81818182 0.81818182] mean value: 0.8200000000000001 key: train_recall value: [0.97894737 0.94736842 0.97894737 0.94736842 0.96842105 0.9787234 0.9787234 0.9893617 0.96808511 0.9787234 ] mean value: 0.9714669652855543 /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:148: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:151: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) key: test_roc_auc value: [0.85454545 0.85454545 0.66818182 1. 0.90454545 0.81363636 0.76818182 0.95 0.90909091 0.85909091] mean value: 0.8581818181818182 key: train_roc_auc value: [0.98415454 0.96836506 0.96819709 0.96304591 0.96293393 0.97357223 0.97883539 0.97889138 0.97351624 0.96830907] mean value: 0.9719820828667414 key: test_jcc value: [0.72727273 0.72727273 0.5 1. 
0.81818182 0.66666667 0.58333333 0.91666667 0.81818182 0.75 ] mean value: 0.7507575757575757 key: train_jcc value: [0.96875 0.9375 0.93939394 0.92783505 0.92929293 0.94845361 0.95833333 0.95876289 0.94791667 0.93877551] mean value: 0.9455013925282703 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), 
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.32369375 0.40332413 0.32787657 0.30810881 0.24554396 0.37028098 0.35776758 0.43476796 0.61309886 0.40438581] mean value: 0.37888484001159667 key: score_time value: [0.04028463 0.04569244 0.02587414 0.01976562 0.02221918 0.01939678 0.02430534 0.04026508 0.05725694 0.02162313] mean value: 0.0316683292388916 key: test_mcc value: [0.62641448 0.61818182 0.53935989 0.90909091 0.90829511 0.71818182 0.74795759 1. 0.80909091 0.71818182] mean value: 0.7594754344239546 key: train_mcc value: [0.98947251 0.95789003 0.95767077 0.94714446 0.95767077 0.94714446 0.95767077 0.94713854 0.92597984 0.93672304] mean value: 0.9524505200298267 key: test_accuracy value: [0.80952381 0.80952381 0.76190476 0.95238095 0.95238095 0.85714286 0.85714286 1. 
0.9047619 0.85714286] mean value: 0.8761904761904762 key: train_accuracy value: [0.99470899 0.97883598 0.97883598 0.97354497 0.97883598 0.97354497 0.97883598 0.97354497 0.96296296 0.96825397] mean value: 0.9761904761904762 key: test_fscore value: [0.77777778 0.8 0.70588235 0.95238095 0.94736842 0.85714286 0.84210526 1. 0.90909091 0.85714286] mean value: 0.8648891390687057 key: train_fscore value: [0.9947644 0.9787234 0.97894737 0.97354497 0.97894737 0.97354497 0.9787234 0.97326203 0.96296296 0.96842105] mean value: 0.9761841938028554 key: test_precision value: [0.875 0.8 0.85714286 0.90909091 1. 0.9 1. 1. 0.90909091 0.9 ] mean value: 0.9150324675324675 key: train_precision value: [0.98958333 0.98924731 0.97894737 0.9787234 0.97894737 0.96842105 0.9787234 0.97849462 0.95789474 0.95833333] mean value: 0.9757315936976966 key: test_recall value: [0.7 0.8 0.6 1. 0.9 0.81818182 0.72727273 1. 0.90909091 0.81818182] mean value: 0.8272727272727273 key: train_recall value: [1. 0.96842105 0.97894737 0.96842105 0.97894737 0.9787234 0.9787234 0.96808511 0.96808511 0.9787234 ] mean value: 0.9767077267637179 key: test_roc_auc value: [0.80454545 0.80909091 0.75454545 0.95454545 0.95 0.85909091 0.86363636 1. 0.90454545 0.85909091] mean value: 0.8759090909090909 key: train_roc_auc value: [0.99468085 0.97889138 0.97883539 0.97357223 0.97883539 0.97357223 0.97883539 0.97351624 0.96298992 0.96830907] mean value: 0.9762038073908175 key: test_jcc value: [0.63636364 0.66666667 0.54545455 0.90909091 0.9 0.75 0.72727273 1. 
0.83333333 0.75 ] mean value: 0.7718181818181818 key: train_jcc value: [0.98958333 0.95833333 0.95876289 0.94845361 0.95876289 0.94845361 0.95833333 0.94791667 0.92857143 0.93877551] mean value: 0.9535946595132898 MCC on Blind test: 0.6 Accuracy on Blind test: 0.8 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, 
n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegression(random_state=42))]) key: fit_time value: [0.07115984 0.11514401 0.103935 0.06160641 0.0467484 0.14352918 0.13648367 0.05514574 0.06950617 0.10242081] mean value: 0.09056792259216309 key: score_time value: [0.02112889 0.01284909 0.02215528 0.02116251 0.02008915 0.0174675 0.0115118 0.03403592 0.04912376 0.01486492] mean value: 0.022438883781433105 key: test_mcc value: [ 0.41475753 0.54761905 0.73192505 0.41475753 0.07142857 0.73192505 0.28288947 0.38575837 0.41475753 -0.23809524] mean value: 0.3757722933220292 key: train_mcc value: [0.8120433 0.82904734 0.82904734 0.88144164 0.8120433 0.77888301 0.82904734 0.82958203 0.81310356 0.84732411] mean value: 0.8261562964430988 key: test_accuracy value: [0.69230769 0.76923077 0.84615385 0.69230769 0.53846154 0.84615385 0.61538462 0.69230769 0.69230769 0.38461538] mean value: 0.676923076923077 key: train_accuracy value: [0.90598291 0.91452991 0.91452991 0.94017094 0.90598291 0.88888889 0.91452991 0.91452991 0.90598291 0.92307692] mean value: 0.9128205128205128 key: test_fscore value: [0.71428571 0.76923077 0.85714286 0.71428571 0.5 0.83333333 0.54545455 0.75 0.66666667 0.42857143] mean value: 0.6778971028971029 key: train_fscore value: [0.90756303 0.91525424 0.91525424 0.94214876 0.90756303 0.8907563 0.9137931 0.91525424 0.90756303 0.92436975] mean value: 0.9139519701693681 key: test_precision value: [0.625 0.71428571 0.75 0.625 0.5 1. 
0.75 0.66666667 0.8 0.42857143] mean value: 0.685952380952381 key: train_precision value: [0.9 0.91525424 0.91525424 0.91935484 0.9 0.86885246 0.9137931 0.9 0.8852459 0.90163934] mean value: 0.9019394121652258 key: test_recall value: [0.83333333 0.83333333 1. 0.83333333 0.5 0.71428571 0.42857143 0.85714286 0.57142857 0.42857143] mean value: 0.7 key: train_recall value: [0.91525424 0.91525424 0.91525424 0.96610169 0.91525424 0.9137931 0.9137931 0.93103448 0.93103448 0.94827586] mean value: 0.9265049678550555 key: test_roc_auc value: [0.70238095 0.77380952 0.85714286 0.70238095 0.53571429 0.85714286 0.63095238 0.67857143 0.70238095 0.38095238] mean value: 0.6821428571428572 key: train_roc_auc value: [0.90590298 0.91452367 0.91452367 0.9399474 0.90590298 0.88909994 0.91452367 0.91466978 0.90619521 0.92329047] mean value: 0.9128579777907656 key: test_jcc value: [0.55555556 0.625 0.75 0.55555556 0.33333333 0.71428571 0.375 0.6 0.5 0.27272727] mean value: 0.5281457431457431 key: train_jcc value: [0.83076923 0.84375 0.84375 0.890625 0.83076923 0.8030303 0.84126984 0.84375 0.83076923 0.859375 ] mean value: 0.8417857836607837 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Logistic RegressionCV Model func: LogisticRegressionCV(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. 
of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, 
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [2.14313459 2.65574384 2.69484901 2.17424965 2.27060723 2.20656586 2.04485202 2.53665352 2.39897084 2.08672571] mean value: 2.3212352275848387 key: score_time value: [0.01857519 0.02521801 0.01422095 0.02029347 0.03888059 0.01733184 0.02459574 0.04596186 0.0188508 0.01879811] mean value: 0.024272656440734862 key: test_mcc value: [0.41475753 0.54761905 0.54761905 0.41475753 0.59160798 0.85714286 0.23809524 0.53674504 0.69047619 0.09759001] mean value: 0.49364104686851423 key: train_mcc value: [0.93218361 1. 1. 0.88144164 1. 0.8974284 1. 1. 1. 1. ] mean value: 0.9711053657014244 key: test_accuracy value: [0.69230769 0.76923077 0.76923077 0.69230769 0.76923077 0.92307692 0.61538462 0.76923077 0.84615385 0.53846154] mean value: 0.7384615384615385 key: train_accuracy value: [0.96581197 1. 1. 0.94017094 1. 0.94871795 1. 1. 1. 1. ] mean value: 0.9854700854700855 key: test_fscore value: [0.71428571 0.76923077 0.76923077 0.71428571 0.66666667 0.92307692 0.61538462 0.8 0.85714286 0.5 ] mean value: 0.7329304029304029 key: train_fscore value: [0.96551724 1. 1. 0.94214876 1. 0.94827586 1. 1. 1. 1. ] mean value: 0.9855941863778854 key: test_precision value: [0.625 0.71428571 0.71428571 0.625 1. 1. 0.66666667 0.75 0.85714286 0.6 ] mean value: 0.7552380952380953 key: train_precision value: [0.98245614 1. 1. 0.91935484 1. 0.94827586 1. 1. 1. 1. 
] mean value: 0.985008684112952 key: test_recall value: [0.83333333 0.83333333 0.83333333 0.83333333 0.5 0.85714286 0.57142857 0.85714286 0.85714286 0.42857143] mean value: 0.7404761904761905 key: train_recall value: [0.94915254 1. 1. 0.96610169 1. 0.94827586 1. 1. 1. 1. ] mean value: 0.9863530099357101 key: test_roc_auc value: [0.70238095 0.77380952 0.77380952 0.70238095 0.75 0.92857143 0.61904762 0.76190476 0.8452381 0.54761905] mean value: 0.7404761904761905 key: train_roc_auc value: [0.96595558 1. 1. 0.9399474 1. 0.9487142 1. 1. 1. 1. ] mean value: 0.9854617182933957 key: test_jcc value: [0.55555556 0.625 0.625 0.55555556 0.5 0.85714286 0.44444444 0.66666667 0.75 0.33333333] mean value: 0.5912698412698413 key: train_jcc value: [0.93333333 1. 1. 0.890625 1. 0.90163934 1. 1. 1. 1. ] mean value: 0.9725597677595629 MCC on Blind test: 0.43 Accuracy on Blind test: 0.73 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, 
min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.01936674 0.01057959 0.01035166 0.01044059 0.0103271 0.01035023 0.01044679 0.01057243 0.01038861 0.01044059] mean value: 0.011326432228088379 key: score_time value: [0.01023579 0.01007533 0.01012969 0.01011801 0.01018596 0.01004577 0.0100286 0.01010728 0.01014471 0.01001549] mean value: 0.010108661651611329 key: test_mcc value: [ 0.09759001 0.23809524 0.38095238 0.09759001 0.38095238 0.53674504 0.23809524 0.38575837 0.22537447 -0.41475753] mean value: 0.21663956046363453 key: train_mcc value: [0.62939175 0.56318771 0.50572841 0.54006981 0.56027975 0.52451345 0.63808526 0.55801254 0.55654161 0.57355974] mean value: 0.5649370021355652 key: test_accuracy value: [0.53846154 0.61538462 0.69230769 0.53846154 0.69230769 0.76923077 0.61538462 0.69230769 0.61538462 0.30769231] mean value: 0.6076923076923078 key: train_accuracy value: [0.81196581 0.77777778 0.75213675 0.76923077 0.77777778 0.76068376 0.81196581 0.76923077 0.74358974 0.78632479] mean value: 0.7760683760683761 key: test_fscore value: [0.57142857 0.61538462 0.66666667 0.57142857 0.66666667 0.8 0.61538462 0.75 0.70588235 0.4 ] mean value: 0.6362842059900883 key: train_fscore value: [0.82539683 0.796875 0.76422764 0.7804878 0.79365079 0.7704918 0.828125 0.79389313 0.79166667 0.78991597] mean value: 0.7934730632304993 key: test_precision value: [0.5 0.57142857 0.66666667 0.5 0.66666667 0.75 0.66666667 0.66666667 0.6 0.375 ] mean value: 0.5963095238095237 key: train_precision value: [0.7761194 0.73913043 0.734375 0.75 0.74626866 0.734375 0.75714286 0.71232877 0.6627907 0.7704918 ] mean value: 0.7383022619703353 key: 
test_recall value: [0.66666667 0.66666667 0.66666667 0.66666667 0.66666667 0.85714286 0.57142857 0.85714286 0.85714286 0.42857143] mean value: 0.6904761904761905 key: train_recall value: [0.88135593 0.86440678 0.79661017 0.81355932 0.84745763 0.81034483 0.9137931 0.89655172 0.98275862 0.81034483] mean value: 0.8617182933956751 key: test_roc_auc value: [0.54761905 0.61904762 0.69047619 0.54761905 0.69047619 0.76190476 0.61904762 0.67857143 0.5952381 0.29761905] mean value: 0.6047619047619048 key: train_roc_auc value: [0.81136762 0.77703098 0.75175336 0.76884863 0.77717709 0.76110462 0.81282876 0.77030976 0.7456166 0.78652835] mean value: 0.7762565751022794 key: test_jcc value: [0.4 0.44444444 0.5 0.4 0.5 0.66666667 0.44444444 0.6 0.54545455 0.25 ] mean value: 0.4751010101010101 key: train_jcc value: [0.7027027 0.66233766 0.61842105 0.64 0.65789474 0.62666667 0.70666667 0.65822785 0.65517241 0.65277778] mean value: 0.6580867527519529 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, 
importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01078939 0.01074529 0.01073861 0.01068044 0.01072288 0.01063657 0.01069188 0.01077175 0.01050353 0.01064897] mean value: 0.010692930221557618 key: score_time value: [0.01027346 0.0101223 0.01015067 0.01009154 0.01016331 0.01005459 0.01008916 0.01028109 0.01001859 0.01019144] mean value: 0.010143613815307618 key: test_mcc value: [-0.05143445 0.54761905 0.53674504 -0.07142857 -0.28288947 0.54761905 0.28288947 0.38095238 0.38095238 -0.38575837] mean value: 0.18852665009433853 key: train_mcc value: [0.5393392 0.54074089 0.55597781 0.58971362 0.59133581 0.61080452 0.6087526 0.59794138 0.56027975 0.64168717] mean value: 0.5836572739419282 key: test_accuracy value: [0.46153846 0.76923077 0.76923077 0.46153846 0.38461538 0.76923077 0.61538462 0.69230769 0.69230769 0.30769231] mean value: 0.5923076923076923 key: train_accuracy value: [0.76923077 0.76923077 0.77777778 0.79487179 0.79487179 0.8034188 0.8034188 0.79487179 0.77777778 0.82051282] mean value: 0.7905982905982906 key: test_fscore value: [0.53333333 0.76923077 0.72727273 0.46153846 0.2 0.76923077 0.54545455 0.71428571 0.71428571 0.18181818] mean value: 0.5616450216450216 key: train_fscore value: [0.76521739 0.76106195 0.77586207 0.79661017 0.78947368 0.78899083 0.79279279 0.77358491 0.75925926 0.81415929] mean value: 0.7817012336310473 key: test_precision value: [0.44444444 0.71428571 0.8 0.42857143 0.25 0.83333333 0.75 0.71428571 0.71428571 0.25 ] mean value: 0.589920634920635 key: train_precision value: [0.78571429 0.7962963 0.78947368 0.79661017 0.81818182 0.84313725 0.83018868 0.85416667 0.82 0.83636364] mean value: 
0.8170132491071999 key: test_recall value: [0.66666667 0.83333333 0.66666667 0.5 0.16666667 0.71428571 0.42857143 0.71428571 0.71428571 0.14285714] mean value: 0.5547619047619048 key: train_recall value: [0.74576271 0.72881356 0.76271186 0.79661017 0.76271186 0.74137931 0.75862069 0.70689655 0.70689655 0.79310345] mean value: 0.7503506721215664 key: test_roc_auc value: [0.47619048 0.77380952 0.76190476 0.46428571 0.36904762 0.77380952 0.63095238 0.69047619 0.69047619 0.32142857] mean value: 0.5952380952380952 key: train_roc_auc value: [0.76943308 0.76957919 0.77790766 0.79485681 0.79514904 0.80289305 0.80303916 0.79412624 0.77717709 0.82028054] mean value: 0.7904441846873174 key: test_jcc value: [0.36363636 0.625 0.57142857 0.3 0.11111111 0.625 0.375 0.55555556 0.55555556 0.1 ] mean value: 0.4182287157287157 key: train_jcc value: [0.61971831 0.61428571 0.63380282 0.66197183 0.65217391 0.65151515 0.65671642 0.63076923 0.6119403 0.68656716] mean value: 0.6419460847957068 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01046824 0.01135015 0.01026702 0.01008916 0.01003551 0.0104847 0.01012325 0.01084805 0.010113 0.01030326] mean value: 0.010408234596252442 key: score_time value: [0.01920247 0.0241816 0.01840687 0.02027678 0.01942277 0.01908255 0.02879333 0.0157578 0.01938295 0.0225172 ] mean value: 0.020702433586120606 key: test_mcc value: [ 0.38095238 0.38095238 0.05143445 0.54761905 -0.23809524 -0.05143445 -0.54761905 0.21957752 0.50709255 -0.7200823 ] mean value: 0.053039729323695814 key: train_mcc value: [0.38893486 0.31846508 0.49235618 0.42340863 0.38583198 0.37313533 0.47043398 0.38607028 0.39185302 0.39185302] mean value: 0.4022342373708152 key: test_accuracy value: [0.69230769 0.69230769 0.53846154 0.76923077 0.38461538 0.46153846 0.23076923 0.61538462 0.69230769 0.15384615] mean value: 0.5230769230769231 key: train_accuracy value: [0.69230769 0.65811966 0.74358974 0.70940171 0.69230769 0.68376068 0.73504274 0.69230769 0.69230769 0.69230769] mean value: 0.6991452991452991 key: test_fscore value: [0.66666667 0.66666667 0.4 0.76923077 0.33333333 0.36363636 0.28571429 0.66666667 0.6 0. ] mean value: 0.4751914751914752 key: train_fscore value: [0.67272727 0.64285714 0.72727273 0.69090909 0.68421053 0.64761905 0.72566372 0.67272727 0.65384615 0.65384615] mean value: 0.677167910493481 key: test_precision value: [0.66666667 0.66666667 0.5 0.71428571 0.33333333 0.5 0.28571429 0.625 1. 0. 
] mean value: 0.5291666666666667 key: train_precision value: [0.7254902 0.67924528 0.78431373 0.74509804 0.70909091 0.72340426 0.74545455 0.71153846 0.73913043 0.73913043] mean value: 0.7301896284771464 key: test_recall value: [0.66666667 0.66666667 0.33333333 0.83333333 0.33333333 0.28571429 0.28571429 0.71428571 0.42857143 0. ] mean value: 0.45476190476190476 key: train_recall value: [0.62711864 0.61016949 0.6779661 0.6440678 0.66101695 0.5862069 0.70689655 0.63793103 0.5862069 0.5862069 ] mean value: 0.6323787258912916 key: test_roc_auc value: [0.69047619 0.69047619 0.52380952 0.77380952 0.38095238 0.47619048 0.22619048 0.60714286 0.71428571 0.16666667] mean value: 0.525 key: train_roc_auc value: [0.69286967 0.65853302 0.74415546 0.70996493 0.69257744 0.68293396 0.73480421 0.69184687 0.69140853 0.69140853] mean value: 0.6990502630040912 key: test_jcc value: [0.5 0.5 0.25 0.625 0.2 0.22222222 0.16666667 0.5 0.42857143 0. ] mean value: 0.33924603174603174 key: train_jcc value: [0.50684932 0.47368421 0.57142857 0.52777778 0.52 0.47887324 0.56944444 0.50684932 0.48571429 0.48571429] mean value: 0.5126335445179286 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', 
BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01252365 0.01188517 0.01209474 0.01199675 0.01207352 0.01212645 0.01216841 0.01206255 0.01212716 0.01190042] mean value: 0.012095880508422852 key: score_time value: [0.01078773 0.01077604 0.01067948 0.01055169 0.01071 0.01069093 0.01049042 0.01056886 0.01054859 0.01059175] mean value: 0.010639548301696777 key: test_mcc value: [ 0.23809524 0.53674504 0.54761905 0.41475753 -0.23809524 0.54761905 0.14085904 0.53674504 0.41475753 -0.38095238] mean value: 0.2758149898990107 key: train_mcc value: [0.66472504 0.64361355 0.64361355 0.72698045 0.71044192 0.71177678 0.79485681 0.67593781 0.693731 0.67743539] mean value: 0.6943112296352275 key: test_accuracy value: [0.61538462 0.76923077 0.76923077 0.69230769 0.38461538 0.76923077 0.53846154 0.76923077 0.69230769 0.30769231] mean value: 0.6307692307692307 key: train_accuracy value: [0.82905983 0.82051282 0.82051282 0.86324786 0.85470085 0.85470085 0.8974359 0.83760684 0.84615385 0.83760684] mean value: 0.8461538461538461 key: test_fscore value: [0.61538462 0.72727273 0.76923077 0.71428571 0.33333333 0.76923077 0.4 0.8 0.66666667 0.30769231] mean value: 0.6103096903096903 key: train_fscore value: [0.81818182 0.81415929 0.81415929 0.86206897 0.85217391 0.84684685 0.89655172 0.83185841 0.83928571 0.82882883] mean value: 0.8404114801992302 key: test_precision value: [0.57142857 0.8 0.71428571 0.625 0.33333333 0.83333333 0.66666667 0.75 0.8 0.33333333] mean value: 0.6427380952380952 key: train_precision value: [0.88235294 0.85185185 0.85185185 0.87719298 0.875 0.88679245 0.89655172 0.85454545 0.87037037 0.86792453] mean value: 
0.8714434157522146 key: test_recall value: [0.66666667 0.66666667 0.83333333 0.83333333 0.33333333 0.71428571 0.28571429 0.85714286 0.57142857 0.28571429] mean value: 0.6047619047619047 key: train_recall value: [0.76271186 0.77966102 0.77966102 0.84745763 0.83050847 0.81034483 0.89655172 0.81034483 0.81034483 0.79310345] mean value: 0.8120689655172414 key: test_roc_auc value: [0.61904762 0.76190476 0.77380952 0.70238095 0.38095238 0.77380952 0.55952381 0.76190476 0.70238095 0.30952381] mean value: 0.6345238095238096 key: train_roc_auc value: [0.82963179 0.82086499 0.82086499 0.86338399 0.85490941 0.85432496 0.8974284 0.8373758 0.84585038 0.83722969] mean value: 0.8461864406779661 key: test_jcc value: [0.44444444 0.57142857 0.625 0.55555556 0.2 0.625 0.25 0.66666667 0.5 0.18181818] mean value: 0.461991341991342 key: train_jcc value: [0.69230769 0.68656716 0.68656716 0.75757576 0.74242424 0.734375 0.8125 0.71212121 0.72307692 0.70769231] mean value: 0.7255207463556345 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, 
colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [0.92927122 1.5305016 1.36941218 1.77340817 1.98329568 1.87210155 2.13366818 1.67377758 1.89807558 2.13494849] mean value: 1.7298460245132445 key: score_time value: [0.02280736 0.01201725 0.01104426 0.0204463 0.02194667 0.02331734 0.02230024 0.03766036 0.01530027 0.01807332] mean value: 0.02049133777618408 key: test_mcc value: [ 0.28288947 0.54761905 0.38095238 0.09759001 -0.07142857 0.73192505 -0.21957752 0.21957752 0.85714286 -0.41475753] mean value: 0.24119327202193427 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.76923077 0.69230769 0.53846154 0.46153846 0.84615385 0.38461538 0.61538462 0.92307692 0.30769231] mean value: 0.6153846153846154 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.76923077 0.66666667 0.57142857 0.46153846 0.83333333 0.33333333 0.66666667 0.92307692 0.4 ] mean value: 0.6291941391941391 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.55555556 0.71428571 0.66666667 0.5 0.42857143 1. 0.4 0.625 1. 0.375 ] mean value: 0.6265079365079365 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.83333333 0.83333333 0.66666667 0.66666667 0.5 0.71428571 0.28571429 0.71428571 0.85714286 0.42857143] mean value: 0.65 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.63095238 0.77380952 0.69047619 0.54761905 0.46428571 0.85714286 0.39285714 0.60714286 0.92857143 0.29761905] mean value: 0.6190476190476191 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.625 0.5 0.4 0.3 0.71428571 0.2 0.5 0.85714286 0.25 ] mean value: 0.48464285714285715 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, 
reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.02312112 0.02405119 0.02581382 0.03118682 0.04043317 0.035918 0.03535628 0.03518629 0.01530671 0.02257276] mean value: 0.028894615173339844 key: score_time value: [0.02141547 0.03261065 0.03187895 0.02285576 0.02194715 0.02182031 0.02029324 0.02223802 0.01271105 0.01209235] mean value: 0.02198629379272461 key: test_mcc value: [0.23809524 0.6172134 0.85391256 0.85714286 0.69047619 1. 1. 
0.85714286 0.85714286 0.73192505] mean value: 0.7703051018389734 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.76923077 0.92307692 0.92307692 0.84615385 1. 1. 0.92307692 0.92307692 0.84615385] mean value: 0.8769230769230769 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.61538462 0.8 0.90909091 0.92307692 0.83333333 1. 1. 0.92307692 0.92307692 0.83333333] mean value: 0.876037296037296 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.57142857 0.66666667 1. 0.85714286 0.83333333 1. 1. 1. 1. 1. ] mean value: 0.8928571428571428 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 1. 0.83333333 1. 0.83333333 1. 1. 0.85714286 0.85714286 0.71428571] mean value: 0.8761904761904762 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61904762 0.78571429 0.91666667 0.92857143 0.8452381 1. 1. 0.92857143 0.92857143 0.85714286] mean value: 0.8809523809523809 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.44444444 0.66666667 0.83333333 0.85714286 0.71428571 1. 1. 0.85714286 0.85714286 0.71428571] mean value: 0.7944444444444444 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.76 Accuracy on Blind test: 0.87 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.08921742 0.08900595 0.09549665 0.12468243 0.12501621 0.12542987 0.11990476 0.1254077 0.12537408 0.09747744] mean value: 0.11170125007629395 key: score_time value: [0.01770282 0.01791883 0.0226357 0.02327061 0.02341628 0.02357483 0.02346158 0.02348065 0.02347827 0.01771784] mean value: 0.021665740013122558 key: test_mcc value: [ 0.09759001 0.54761905 0.21957752 0.41475753 -0.09759001 1. 0.14085904 0.54761905 0.6172134 0.09759001] mean value: 0.3585235592252616 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.53846154 0.76923077 0.61538462 0.69230769 0.46153846 1. 0.53846154 0.76923077 0.76923077 0.53846154] mean value: 0.6692307692307693 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.57142857 0.76923077 0.54545455 0.71428571 0.36363636 1. 0.4 0.76923077 0.72727273 0.5 ] mean value: 0.636053946053946 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.5 0.71428571 0.6 0.625 0.4 1. 0.66666667 0.83333333 1. 
0.6 ] mean value: 0.6939285714285715 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 0.83333333 0.5 0.83333333 0.33333333 1. 0.28571429 0.71428571 0.57142857 0.42857143] mean value: 0.6166666666666667 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.54761905 0.77380952 0.60714286 0.70238095 0.45238095 1. 0.55952381 0.77380952 0.78571429 0.54761905] mean value: 0.675 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.4 0.625 0.375 0.55555556 0.22222222 1. 0.25 0.625 0.57142857 0.33333333] mean value: 0.4957539682539682 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, 
monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.00922561 0.01216745 0.00913382 0.00917363 0.00927591 0.00904441 0.00901008 0.00905704 0.00984097 0.0092206 ] mean value: 0.009514951705932617 key: score_time value: [0.00917149 0.01156068 0.00884557 0.00909114 0.00899363 0.00888038 0.00885415 0.00877523 0.00928211 0.00903535] mean value: 0.009248971939086914 key: test_mcc value: [ 0.28288947 0.38095238 0.05143445 0.28288947 -0.41475753 0.59160798 0.09759001 0.38095238 0.09759001 -0.53674504] mean value: 0.12144035835279779 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.69230769 0.53846154 0.61538462 0.30769231 0.76923077 0.53846154 0.69230769 0.53846154 0.23076923] mean value: 0.5538461538461539 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.66666667 0.4 0.66666667 0.18181818 0.82352941 0.5 0.71428571 0.5 0.16666667] mean value: 0.5286299974535269 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.55555556 0.66666667 0.5 0.55555556 0.2 0.7 0.6 0.71428571 0.6 0.2 ] mean value: 0.5292063492063492 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.83333333 0.66666667 0.33333333 0.83333333 0.16666667 1. 0.42857143 0.71428571 0.42857143 0.14285714] mean value: 0.5547619047619048 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.63095238 0.69047619 0.52380952 0.63095238 0.29761905 0.75 0.54761905 0.69047619 0.54761905 0.23809524] mean value: 0.5547619047619048 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.5 0.25 0.5 0.1 0.7 0.33333333 0.55555556 0.33333333 0.09090909] mean value: 0.38631313131313133 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: -0.29 Accuracy on Blind test: 0.4 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.12453771 1.23042583 1.6623981 1.80660892 1.10945749 1.1398468 1.11876416 1.10889626 1.09741282 2.23863149] mean value: 1.3636979579925537 key: score_time value: [0.0901432 0.105124 0.27335095 0.0891552 0.09121609 0.09351397 0.08874822 0.08856773 0.08960176 0.21340656] mean value: 0.1222827672958374 key: test_mcc value: [0.38095238 0.6172134 0.53674504 0.41475753 0.38575837 1. 0.39477102 0.73192505 0.28288947 0.28288947] mean value: 0.5027901748379063 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.69230769 0.76923077 0.76923077 0.69230769 0.69230769 1. 
0.61538462 0.84615385 0.61538462 0.61538462] mean value: 0.7307692307692308 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.8 0.72727273 0.71428571 0.6 1. 0.44444444 0.83333333 0.54545455 0.54545455] mean value: 0.6876911976911977 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.66666667 0.66666667 0.8 0.625 0.75 1. 1. 1. 0.75 0.75 ] mean value: 0.8008333333333333 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 1. 0.66666667 0.83333333 0.5 1. 0.28571429 0.71428571 0.42857143 0.42857143] mean value: 0.6523809523809524 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.69047619 0.78571429 0.76190476 0.70238095 0.67857143 1. 0.64285714 0.85714286 0.63095238 0.63095238] mean value: 0.7380952380952381 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. 
To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( [0.5 0.66666667 0.57142857 0.55555556 0.42857143 1. 0.28571429 0.71428571 0.375 0.375 ] mean value: 0.5472222222222222 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [2.01023936 1.5312047 1.59749556 1.66958928 1.76442599 1.60279465 1.98377275 1.63427329 1.65822673 1.49137998] mean value: 1.6943402290344238 key: score_time value: [0.21109867 0.20521808 0.27385473 0.18928909 0.19972277 0.15325022 0.16408372 0.16386247 0.17173505 0.19090343] mean value: 0.1923018217086792 key: test_mcc value: [0.23809524 0.6172134 0.69047619 0.6172134 0.38095238 1. 0.39477102 0.85391256 0.41475753 0.09759001] mean value: 0.5304981728324353 key: train_mcc value: [0.94994292 0.94994292 0.96636481 0.98304594 0.96636481 0.93384219 0.98305085 0.94998574 0.93161894 0.94998574] mean value: 0.9564144838510643 key: test_accuracy value: [0.61538462 0.76923077 0.84615385 0.76923077 0.69230769 1. 
0.61538462 0.92307692 0.69230769 0.53846154] mean value: 0.7461538461538462 key: train_accuracy value: [0.97435897 0.97435897 0.98290598 0.99145299 0.98290598 0.96581197 0.99145299 0.97435897 0.96581197 0.97435897] mean value: 0.9777777777777777 key: test_fscore value: [0.61538462 0.8 0.83333333 0.8 0.66666667 1. 0.44444444 0.93333333 0.66666667 0.5 ] mean value: 0.7259829059829059 key: train_fscore value: [0.97520661 0.97520661 0.98333333 0.99159664 0.98333333 0.96666667 0.99145299 0.97478992 0.96551724 0.97478992] mean value: 0.9781893259894366 key: test_precision value: [0.57142857 0.66666667 0.83333333 0.66666667 0.66666667 1. 1. 0.875 0.8 0.6 ] mean value: 0.7679761904761905 key: train_precision value: [0.9516129 0.9516129 0.96721311 0.98333333 0.96721311 0.93548387 0.98305085 0.95081967 0.96551724 0.95081967] mean value: 0.9606676673360117 key: test_recall value: [0.66666667 1. 0.83333333 1. 0.66666667 1. 0.28571429 1. 0.57142857 0.42857143] mean value: 0.7452380952380953 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 0.96551724 1. ] mean value: 0.996551724137931 key: test_roc_auc value: [0.61904762 0.78571429 0.8452381 0.78571429 0.69047619 1. 0.64285714 0.91666667 0.70238095 0.54761905] mean value: 0.7535714285714286 key: train_roc_auc value: [0.97413793 0.97413793 0.98275862 0.99137931 0.98275862 0.96610169 0.99152542 0.97457627 0.96580947 0.97457627] mean value: 0.9777761542957335 key: test_jcc value: [0.44444444 0.66666667 0.71428571 0.66666667 0.5 1. 
0.28571429 0.875 0.5 0.33333333] mean value: 0.5986111111111111 key: train_jcc value: [0.9516129 0.9516129 0.96721311 0.98333333 0.96721311 0.93548387 0.98305085 0.95081967 0.93333333 0.95081967] mean value: 0.957449276531414 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.0101862 0.00909281 0.0090611 0.0091393 0.00983238 0.00917673 0.0095005 0.0091691 0.00907755 0.00950789] mean value: 0.009374356269836426 key: score_time value: [0.00923538 0.00892949 0.00899577 0.0089128 0.00890374 0.00899363 0.00957346 0.00951457 0.00898981 0.00949144] mean value: 0.009154009819030761 key: test_mcc value: [-0.05143445 0.54761905 0.53674504 -0.07142857 -0.28288947 0.54761905 0.28288947 0.38095238 0.38095238 -0.38575837] mean value: 0.18852665009433853 key: train_mcc value: [0.5393392 0.54074089 0.55597781 0.58971362 0.59133581 0.61080452 0.6087526 0.59794138 0.56027975 0.64168717] mean value: 0.5836572739419282 key: test_accuracy value: [0.46153846 0.76923077 0.76923077 0.46153846 0.38461538 0.76923077 0.61538462 0.69230769 0.69230769 0.30769231] mean value: 0.5923076923076923 key: train_accuracy value: [0.76923077 0.76923077 0.77777778 0.79487179 0.79487179 0.8034188 0.8034188 0.79487179 0.77777778 0.82051282] mean value: 
0.7905982905982906 key: test_fscore value: [0.53333333 0.76923077 0.72727273 0.46153846 0.2 0.76923077 0.54545455 0.71428571 0.71428571 0.18181818] mean value: 0.5616450216450216 key: train_fscore value: [0.76521739 0.76106195 0.77586207 0.79661017 0.78947368 0.78899083 0.79279279 0.77358491 0.75925926 0.81415929] mean value: 0.7817012336310473 key: test_precision value: [0.44444444 0.71428571 0.8 0.42857143 0.25 0.83333333 0.75 0.71428571 0.71428571 0.25 ] mean value: 0.589920634920635 key: train_precision value: [0.78571429 0.7962963 0.78947368 0.79661017 0.81818182 0.84313725 0.83018868 0.85416667 0.82 0.83636364] mean value: 0.8170132491071999 key: test_recall value: [0.66666667 0.83333333 0.66666667 0.5 0.16666667 0.71428571 0.42857143 0.71428571 0.71428571 0.14285714] mean value: 0.5547619047619048 key: train_recall value: [0.74576271 0.72881356 0.76271186 0.79661017 0.76271186 0.74137931 0.75862069 0.70689655 0.70689655 0.79310345] mean value: 0.7503506721215664 key: test_roc_auc value: [0.47619048 0.77380952 0.76190476 0.46428571 0.36904762 0.77380952 0.63095238 0.69047619 0.69047619 0.32142857] mean value: 0.5952380952380952 key: train_roc_auc value: [0.76943308 0.76957919 0.77790766 0.79485681 0.79514904 0.80289305 0.80303916 0.79412624 0.77717709 0.82028054] mean value: 0.7904441846873174 key: test_jcc value: [0.36363636 0.625 0.57142857 0.3 0.11111111 0.625 0.375 0.55555556 0.55555556 0.1 ] mean value: 0.4182287157287157 key: train_jcc value: [0.61971831 0.61428571 0.63380282 0.66197183 0.65217391 0.65151515 0.65671642 0.63076923 0.6119403 0.68656716] mean value: 0.6419460847957068 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, 
min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [3.79892635 4.06916142 3.78275681 3.70944476 3.7265656 3.06774664 1.29508638 1.30997825 1.30333805 1.31682229] mean value: 2.737982654571533 key: score_time value: [0.03303289 0.02331901 0.03021288 0.01799679 0.02469015 0.01262164 0.01303506 0.01236606 0.0128386 0.01293612] mean value: 0.019304919242858886 key: test_mcc value: [0.23809524 0.73192505 1. 0.73192505 0.85714286 1. 0.73192505 1. 0.85714286 1. ] mean value: 0.8148156116515152 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.84615385 1. 0.84615385 0.92307692 1. 0.84615385 1. 0.92307692 1. ] mean value: 0.9 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.61538462 0.85714286 1. 0.85714286 0.92307692 1. 0.83333333 1. 0.92307692 1. 
] mean value: 0.9009157509157508 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.57142857 0.75 1. 0.75 0.85714286 1. 1. 1. 1. 1. ] mean value: 0.8928571428571428 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 1. 1. 1. 1. 1. 0.71428571 1. 0.85714286 1. ] mean value: 0.9238095238095239 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61904762 0.85714286 1. 0.85714286 0.92857143 1. 0.85714286 1. 0.92857143 1. ] mean value: 0.9047619047619048 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.44444444 0.75 1. 0.75 0.85714286 1. 0.71428571 1. 0.85714286 1. ] mean value: 0.8373015873015873 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', 
learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.02560902 0.0465641 0.0467689 0.04626966 0.04703498 0.03659153 0.07281756 0.05043769 0.05987668 0.03430367] mean value: 0.046627378463745116 key: score_time value: [0.02470279 0.02194142 0.02393842 0.02235103 0.01229 0.0124898 0.02689171 0.02636933 0.01218748 0.01236296] mean value: 0.01955249309539795 key: test_mcc value: [ 0.28288947 0.23809524 -0.07142857 -0.09759001 0.23809524 -0.05143445 0.28288947 0.21957752 0.23809524 0.07142857] mean value: 0.13506177232779207 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.61538462 0.46153846 0.46153846 0.61538462 0.46153846 0.61538462 0.61538462 0.61538462 0.53846154] mean value: 0.5615384615384615 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.61538462 0.46153846 0.36363636 0.61538462 0.36363636 0.54545455 0.66666667 0.61538462 0.57142857] mean value: 0.5485181485181485 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.55555556 0.57142857 0.42857143 0.4 0.57142857 0.5 0.75 0.625 0.66666667 0.57142857] mean value: 0.5640079365079365 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.83333333 0.66666667 0.5 0.33333333 0.66666667 0.28571429 0.42857143 0.71428571 0.57142857 0.57142857] mean value: 0.5571428571428572 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.63095238 0.61904762 0.46428571 0.45238095 0.61904762 0.47619048 0.63095238 0.60714286 0.61904762 0.53571429] mean value: 0.5654761904761905 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.44444444 0.3 0.22222222 0.44444444 0.22222222 0.375 0.5 0.44444444 0.4 ] mean value: 0.3852777777777778 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.39 Accuracy on Blind test: 0.67 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', 
PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.03235841 0.00919366 0.00935245 0.0087781 0.00871897 0.00909734 0.00957131 0.0088644 0.00907183 0.00900698] mean value: 0.01140134334564209 key: score_time value: [0.01621103 0.00943017 0.00886726 0.00846529 0.00861454 0.00898194 0.00923371 0.00882483 0.0093236 0.00886536] mean value: 0.009681773185729981 key: test_mcc value: [ 0.41475753 0.23809524 0.69047619 0.09759001 -0.23809524 0.38095238 -0.07142857 0.38095238 0.54761905 -0.23809524] mean value: 0.22028237287741703 key: train_mcc value: [0.45433325 0.43583749 0.50423855 0.45295149 0.52149771 0.41876096 0.50511865 0.45433325 0.48858389 0.47019287] mean value: 0.47058481235241056 key: test_accuracy value: [0.69230769 0.61538462 0.84615385 0.53846154 0.38461538 0.69230769 
0.46153846 0.69230769 0.76923077 0.38461538] mean value: 0.6076923076923078 key: train_accuracy value: [0.72649573 0.71794872 0.75213675 0.72649573 0.76068376 0.70940171 0.75213675 0.72649573 0.74358974 0.73504274] mean value: 0.7350427350427351 key: test_fscore value: [0.71428571 0.61538462 0.83333333 0.57142857 0.33333333 0.71428571 0.46153846 0.71428571 0.76923077 0.42857143] mean value: 0.6155677655677656 key: train_fscore value: [0.71929825 0.72268908 0.75630252 0.72881356 0.76666667 0.70689655 0.75630252 0.73333333 0.75 0.73504274] mean value: 0.737534520935 key: test_precision value: [0.625 0.57142857 0.83333333 0.5 0.33333333 0.71428571 0.5 0.71428571 0.83333333 0.42857143] mean value: 0.6053571428571428 key: train_precision value: [0.74545455 0.71666667 0.75 0.72881356 0.75409836 0.70689655 0.73770492 0.70967742 0.72580645 0.72881356] mean value: 0.7303932032145685 key: test_recall value: [0.83333333 0.66666667 0.83333333 0.66666667 0.33333333 0.71428571 0.42857143 0.71428571 0.71428571 0.42857143] mean value: 0.6333333333333333 /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result)) 
key: train_recall value: [0.69491525 0.72881356 0.76271186 0.72881356 0.77966102 0.70689655 0.77586207 0.75862069 0.77586207 0.74137931] mean value: 0.7453535943892461 key: test_roc_auc value: [0.70238095 0.61904762 0.8452381 0.54761905 0.38095238 0.69047619 0.46428571 0.69047619 0.77380952 0.38095238] mean value: 0.6095238095238096 key: train_roc_auc value: [0.72676797 0.71785506 0.75204559 0.72647575 0.76052016 0.70938048 0.75233781 0.72676797 0.74386324 0.73509643] mean value: 0.7351110461718293 key: test_jcc value: [0.55555556 0.44444444 0.71428571 0.4 0.2 0.55555556 0.3 0.55555556 0.625 0.27272727] mean value: 0.46231240981240984 key: train_jcc value: [0.56164384 0.56578947 0.60810811 0.57333333 0.62162162 0.54666667 0.60810811 0.57894737 0.6 0.58108108] mean value: 0.5845299596640621 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, 
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01001859 0.01448417 0.01453733 0.01442242 0.01421142 0.01466846 0.01452231 0.01492524 0.03460717 0.01412797] mean value: 0.016052508354187013 key: score_time value: [0.00898385 0.01171184 0.01143146 0.01166272 0.01168752 0.01167083 0.01222038 0.01173306 0.01181865 0.01167536] mean value: 0.011459565162658692 key: test_mcc value: [ 0.39477102 0.59160798 0.54761905 -0.09759001 0.41475753 0.85714286 0. 0.41475753 0.39477102 -0.05143445] mean value: 0.3466402521747625 key: train_mcc value: [0.5256472 0.70108874 0.96580947 0.82695916 0.79157144 0.8524126 0.55242412 0.75475504 0.4300616 0.93214426] mean value: 0.7332873650973463 key: test_accuracy value: [0.61538462 0.76923077 0.76923077 0.46153846 0.69230769 0.92307692 0.46153846 0.69230769 0.61538462 0.46153846] mean value: 0.6461538461538462 key: train_accuracy value: [0.72649573 0.82905983 0.98290598 0.90598291 0.88888889 0.92307692 0.73504274 0.86324786 0.65811966 0.96581197] mean value: 0.8478632478632478 key: test_fscore value: [0.70588235 0.66666667 0.76923077 0.36363636 0.71428571 0.92307692 0. 0.66666667 0.44444444 0.36363636] mean value: 0.5617526264585088 key: train_fscore value: [0.78378378 0.79591837 0.98305085 0.89719626 0.89922481 0.92682927 0.63529412 0.84 0.47368421 0.96491228] mean value: 0.8199893943639955 key: test_precision value: [0.54545455 1. 0.71428571 0.4 0.625 1. 0. 0.8 1. 0.5 ] mean value: 0.6584740259740259 key: train_precision value: [0.65168539 1. 0.98305085 1. 0.82857143 0.87692308 1. 1. 1. 
0.98214286] mean value: 0.9322373603353417 key: test_recall value: [1. 0.5 0.83333333 0.33333333 0.83333333 0.85714286 0. 0.57142857 0.28571429 0.28571429] mean value: 0.55 key: train_recall value: [0.98305085 0.66101695 0.98305085 0.81355932 0.98305085 0.98275862 0.46551724 0.72413793 0.31034483 0.94827586] mean value: 0.7854763296317943 key: test_roc_auc value: [0.64285714 0.75 0.77380952 0.45238095 0.70238095 0.92857143 0.5 0.70238095 0.64285714 0.47619048] mean value: 0.6571428571428571 key: train_roc_auc value: [0.72428404 0.83050847 0.98290473 0.90677966 0.88807715 0.9235827 0.73275862 0.86206897 0.65517241 0.96566335] mean value: 0.8471800116890708 key: test_jcc value: [0.54545455 0.5 0.625 0.22222222 0.55555556 0.85714286 0. 0.5 0.28571429 0.22222222] mean value: 0.43133116883116884 key: train_jcc value: [0.64444444 0.66101695 0.96666667 0.81355932 0.81690141 0.86363636 0.46551724 0.72413793 0.31034483 0.93220339] mean value: 0.7198428544215129 MCC on Blind test: 0.48 Accuracy on Blind test: 0.73 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, 
enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01358604 0.01421356 0.0135591 0.01319122 0.01309133 0.01321769 0.01343799 0.01352835 0.03366351 0.013412 ] mean value: 0.015490078926086425 key: score_time value: [0.01017308 0.01169968 0.0117135 0.0116725 0.01163697 0.01164961 0.0117476 0.01171494 0.01705313 0.01171875] mean value: 0.012077975273132324 key: test_mcc value: [0.03289758 0.54761905 0.7200823 0.23809524 0.23809524 0.54761905 0.09759001 0.41475753 0.54761905 0.05143445] mean value: 0.3435809491904047 key: train_mcc value: [0.70108874 0.96580947 0.79806402 0.79924461 0.76221784 0.8120433 0.88348376 0.82644112 0.83358601 0.75745182] mean value: 0.8139430690182574 key: test_accuracy value: [0.53846154 0.76923077 0.84615385 0.61538462 0.61538462 0.76923077 0.53846154 0.69230769 0.76923077 0.53846154] mean value: 0.6692307692307693 key: train_accuracy value: [0.82905983 0.98290598 0.88888889 0.8974359 0.87179487 0.90598291 0.94017094 0.90598291 0.91452991 0.87179487] mean value: 0.9008547008547009 key: test_fscore value: [0.25 0.76923077 0.8 0.61538462 0.61538462 0.76923077 0.5 0.66666667 0.76923077 0.625 ] mean value: 0.6380128205128205 key: train_fscore value: [0.79591837 0.98305085 0.87619048 0.89285714 0.88549618 0.90434783 0.93693694 0.8952381 0.91803279 0.88188976] mean value: 0.8969958425985054 key: test_precision value: [0.5 0.71428571 1. 0.57142857 0.57142857 0.83333333 0.6 0.8 0.83333333 0.55555556] mean value: 0.697936507936508 key: train_precision value: [1. 0.98305085 1. 0.94339623 0.80555556 0.9122807 0.98113208 1. 
0.875 0.8115942 ] mean value: 0.9312009609552911 key: test_recall value: [0.16666667 0.83333333 0.66666667 0.66666667 0.66666667 0.71428571 0.42857143 0.57142857 0.71428571 0.71428571] mean value: 0.6142857142857143 key: train_recall value: [0.66101695 0.98305085 0.77966102 0.84745763 0.98305085 0.89655172 0.89655172 0.81034483 0.96551724 0.96551724] mean value: 0.8788720046756283 key: test_roc_auc value: [0.51190476 0.77380952 0.83333333 0.61904762 0.61904762 0.77380952 0.54761905 0.70238095 0.77380952 0.52380952] mean value: 0.6678571428571429 key: train_roc_auc value: [0.83050847 0.98290473 0.88983051 0.89786674 0.87083577 0.90590298 0.93980129 0.90517241 0.91496201 0.87258913] mean value: 0.9010374050263005 key: test_jcc value: [0.14285714 0.625 0.66666667 0.44444444 0.44444444 0.625 0.33333333 0.5 0.625 0.45454545] mean value: 0.48612914862914863 key: train_jcc value: [0.66101695 0.96666667 0.77966102 0.80645161 0.79452055 0.82539683 0.88135593 0.81034483 0.84848485 0.78873239] mean value: 0.816263162165426 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', 
colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.10403562 0.09558034 0.09651375 0.09247088 0.09357142 0.09638667 0.09789968 0.09478498 0.09305978 0.0943675 ] mean value: 0.09586706161499023 key: score_time value: [0.01546073 0.015697 0.01498866 0.01505232 0.01517606 0.01585627 0.01545739 0.01489782 0.01497364 0.01499128] mean value: 0.015255117416381836 key: test_mcc value: [0.73192505 0.85714286 0.85391256 0.73192505 0.85714286 1. 0.73192505 0.69047619 0.85391256 0.85714286] mean value: 0.8165505053698895 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.84615385 0.92307692 0.92307692 0.84615385 0.92307692 1. 0.84615385 0.84615385 0.92307692 0.92307692] mean value: 0.9 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.85714286 0.92307692 0.90909091 0.85714286 0.92307692 1. 0.83333333 0.85714286 0.93333333 0.92307692] mean value: 0.9016416916416916 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 0.85714286 1. 0.75 0.85714286 1. 1. 0.85714286 0.875 1. ] mean value: 0.8946428571428571 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.83333333 1. 1. 1. 0.71428571 0.85714286 1. 0.85714286] mean value: 0.9261904761904762 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.85714286 0.92857143 0.91666667 0.85714286 0.92857143 1. 0.85714286 0.8452381 0.91666667 0.92857143] mean value: 0.9035714285714286 key: train_roc_auc value: [1. 1. 1. 1. 
1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.75 0.85714286 0.83333333 0.75 0.85714286 1. 0.71428571 0.75 0.875 0.85714286] mean value: 0.8244047619047619 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', 
AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.03330159 0.04289031 0.03018212 0.03177643 0.03046227 0.03062987 0.03917623 0.05034637 0.03564668 0.03854513] mean value: 0.03629570007324219 key: score_time value: [0.02010465 0.01803446 0.02383947 0.01709294 0.01847744 0.01818728 0.03815055 0.02306819 0.02303863 0.02940965] mean value: 0.02294032573699951 key: test_mcc value: [0.38095238 0.73192505 0.85391256 0.54761905 0.85714286 0.85391256 0.73192505 1. 0.85391256 0.73192505] mean value: 0.7543227141338386 key: train_mcc value: [0.96580947 1. 1. 0.96580947 1. 0.96638414 1. 0.96638414 0.96638414 0.96580947] mean value: 0.9796580822953629 key: test_accuracy value: [0.69230769 0.84615385 0.92307692 0.76923077 0.92307692 0.92307692 0.84615385 1. 0.92307692 0.84615385] mean value: 0.8692307692307693 key: train_accuracy value: [0.98290598 1. 1. 0.98290598 1. 0.98290598 1. 
0.98290598 0.98290598 0.98290598] mean value: 0.9897435897435897 key: test_fscore value: [0.66666667 0.85714286 0.90909091 0.76923077 0.92307692 0.93333333 0.83333333 1. 0.93333333 0.83333333] mean value: 0.8658541458541458 key: train_fscore value: [0.98305085 1. 1. 0.98305085 1. 0.98305085 1. 0.98305085 0.98305085 0.98275862] mean value: 0.989801285797779 key: test_precision value: [0.66666667 0.75 1. 0.71428571 0.85714286 0.875 1. 1. 0.875 1. ] mean value: 0.8738095238095238 key: train_precision value: [0.98305085 1. 1. 0.98305085 1. 0.96666667 1. 0.96666667 0.96666667 0.98275862] mean value: 0.9848860315604909 key: test_recall value: [0.66666667 1. 0.83333333 0.83333333 1. 1. 0.71428571 1. 1. 0.71428571] mean value: 0.8761904761904762 key: train_recall value: [0.98305085 1. 1. 0.98305085 1. 1. 1. 1. 1. 0.98275862] mean value: 0.9948860315604909 key: test_roc_auc value: [0.69047619 0.85714286 0.91666667 0.77380952 0.92857143 0.91666667 0.85714286 1. 0.91666667 0.85714286] mean value: 0.8714285714285714 key: train_roc_auc value: [0.98290473 1. 1. 0.98290473 1. 0.98305085 1. 0.98305085 0.98305085 0.98290473] mean value: 0.9897866744593804 key: test_jcc value: [0.5 0.75 0.83333333 0.625 0.85714286 0.875 0.71428571 1. 0.875 0.71428571] mean value: 0.7744047619047619 key: train_jcc value: [0.96666667 1. 1. 0.96666667 1. 0.96666667 1. 
0.96666667 0.96666667 0.96610169] mean value: 0.9799435028248588 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.03159785 0.0388267 0.0496254 0.0527494 0.06397605 0.05230594 0.03799987 0.04411817 0.04123425 0.05326414] mean value: 0.04656977653503418 key: score_time value: [0.0215857 0.03370619 0.04122758 0.03210497 0.02467465 0.02374005 0.0220952 0.02108288 0.02432179 0.02440858] mean value: 0.026894760131835938 key: test_mcc value: [ 0.23809524 0.53674504 0.07142857 0.21957752 -0.54761905 0.54761905 0.14085904 0.09759001 0.50709255 -0.23809524] mean value: 0.15732927305504008 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.61538462 0.76923077 0.53846154 0.61538462 0.23076923 0.76923077 0.53846154 0.53846154 0.69230769 0.38461538] mean value: 0.5692307692307692 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.61538462 0.72727273 0.5 0.54545455 0.16666667 0.76923077 0.4 0.5 0.6 0.42857143] mean value: 0.5252580752580752 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.57142857 0.8 0.5 0.6 0.16666667 0.83333333 0.66666667 0.6 1. 0.42857143] mean value: 0.6166666666666667 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.66666667 0.66666667 0.5 0.5 0.16666667 0.71428571 0.28571429 0.42857143 0.42857143 0.42857143] mean value: 0.47857142857142854 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61904762 0.76190476 0.53571429 0.60714286 0.22619048 0.77380952 0.55952381 0.54761905 0.71428571 0.38095238] mean value: 0.5726190476190476 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.44444444 0.57142857 0.33333333 0.375 0.09090909 0.625 0.25 0.33333333 0.42857143 0.27272727] mean value: 0.37247474747474746 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.33 Accuracy on Blind test: 0.67 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.27359462 0.25197363 0.24951911 0.24853635 0.2603178 0.25177169 0.26295996 0.25529766 0.24311304 0.25735235] mean value: 0.2554436206817627 key: score_time value: [0.00972319 0.00941014 0.00939226 0.00937867 0.00994706 0.00936031 0.00973701 0.00932145 0.0103333 0.00949168] mean value: 0.009609508514404296 key: test_mcc value: [0.73192505 0.73192505 0.85391256 0.73192505 0.85714286 1. 0.73192505 1. 0.85714286 0.73192505] mean value: 0.822782355167268 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.84615385 0.84615385 0.92307692 0.84615385 0.92307692 1. 0.84615385 1. 0.92307692 0.84615385] mean value: 0.9 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.85714286 0.85714286 0.90909091 0.85714286 0.92307692 1. 0.83333333 1. 0.92307692 0.83333333] mean value: 0.8993339993339993 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 0.75 1. 0.75 0.85714286 1. 1. 1. 1. 1. ] mean value: 0.9107142857142857 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 0.83333333 1. 1. 1. 0.71428571 1. 0.85714286 0.71428571] mean value: 0.9119047619047619 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.85714286 0.85714286 0.91666667 0.85714286 0.92857143 1. 0.85714286 1. 0.92857143 0.85714286] mean value: 0.905952380952381 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.75 0.75 0.83333333 0.75 0.85714286 1. 0.71428571 1. 0.85714286 0.71428571] mean value: 0.8226190476190476 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.76 Accuracy on Blind test: 0.87 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, 
oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887:
UserWarning: Variables are collinear warnings.warn("Variables are collinear") /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear") Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.01716733 0.01644683 0.01679516 0.01655555 0.01668715 0.0164144 0.01659274 0.01631236 0.01657629 0.01654887] mean value: 0.016609668731689453 key: score_time value: [0.01222205 0.01211405 0.01213098 0.01435637 0.01459837 0.01452422 0.01214623 0.0120852 0.01212072 0.01458526] mean value: 0.013088345527648926 key: test_mcc value: [-0.28288947 0.07142857 -0.21957752 0.09759001 -0.28288947 0.23809524 -0.22537447 0.05143445 -0.05143445 0.09759001] mean value: -0.050602711008851185 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.38461538 0.53846154 0.38461538 0.53846154 0.38461538 0.61538462 0.38461538 0.53846154 0.46153846 0.53846154] mean value: 0.47692307692307695 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.2 0.5 0.42857143 0.57142857 0.2 0.61538462 0.2 0.625 0.36363636 0.5 ] mean value: 0.4204020979020979 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_precision value: [0.25 0.5 0.375 0.5 0.25 0.66666667 0.33333333 0.55555556 0.5 0.6 ] mean value: 0.45305555555555554 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.16666667 0.5 0.5 0.66666667 0.16666667 0.57142857 0.14285714 0.71428571 0.28571429 0.42857143] mean value: 0.41428571428571426 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.36904762 0.53571429 0.39285714 0.54761905 0.36904762 0.61904762 0.4047619 0.52380952 0.47619048 0.54761905] mean value: 0.4785714285714286 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.11111111 0.33333333 0.27272727 0.4 0.11111111 0.44444444 0.11111111 0.45454545 0.22222222 0.33333333] mean value: 0.27939393939393936 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: -0.12 Accuracy on Blind test: 0.4 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, 
importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.04744411 0.03657794 0.03558445 0.03328729 0.03785634 0.03590202 0.03718376 0.03619504 0.0340116 0.02791476] mean value: 0.0361957311630249 key: score_time value: [0.02389693 0.0237062 0.02078629 0.02116799 0.02398634 0.023525 0.02397728 0.02406883 0.02208757 0.0217185 ] mean value: 0.022892093658447264 key: test_mcc value: [0.41475753 0.54761905 0.73192505 0.38095238 0.38095238 0.85714286 0.23809524 0.21957752 0.85391256 0.23809524] mean value: 0.4863029808815056 key: train_mcc value: [0.96580947 0.93214426 0.94884541 0.89792372 0.89792372 0.93161894 0.98305085 0.96580947 0.93161894 0.96580947] mean value: 0.9420554222581047 key: test_accuracy value: [0.69230769 0.76923077 0.84615385 0.69230769 0.69230769 0.92307692 0.61538462 0.61538462 0.92307692 0.61538462] mean value: 0.7384615384615385 key: train_accuracy value: [0.98290598 0.96581197 0.97435897 0.94871795 0.94871795 0.96581197 0.99145299 0.98290598 0.96581197 0.98290598] mean value: 0.9709401709401709 key: test_fscore value: [0.71428571 0.76923077 0.85714286 0.66666667 0.66666667 0.92307692 0.61538462 0.66666667 0.93333333 0.61538462] mean value: 0.7427838827838827 key: train_fscore value: [0.98305085 0.96666667 0.97478992 0.95 0.95 0.96551724 0.99145299 0.98275862 0.96551724 0.98275862] mean value: 0.9712512145681603 key: test_precision value: [0.625 0.71428571 0.75 0.66666667 0.66666667 1. 
0.66666667 0.625 0.875 0.66666667] mean value: 0.7255952380952381 key: train_precision value: [0.98305085 0.95081967 0.96666667 0.93442623 0.93442623 0.96551724 0.98305085 0.98275862 0.96551724 0.98275862] mean value: 0.9648992216867394 key: test_recall value: [0.83333333 0.83333333 1. 0.66666667 0.66666667 0.85714286 0.57142857 0.71428571 1. 0.57142857] mean value: 0.7714285714285715 /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:168: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True) /home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:171: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True) key: train_recall value: [0.98305085 0.98305085 0.98305085 0.96610169 0.96610169 0.96551724 1.
0.98275862 0.96551724 0.98275862] mean value: 0.9777907656341321 key: test_roc_auc value: [0.70238095 0.77380952 0.85714286 0.69047619 0.69047619 0.92857143 0.61904762 0.60714286 0.91666667 0.61904762] mean value: 0.7404761904761905 key: train_roc_auc value: [0.98290473 0.96566335 0.97428404 0.94856809 0.94856809 0.96580947 0.99152542 0.98290473 0.96580947 0.98290473] mean value: 0.9708942139099942 key: test_jcc value: [0.55555556 0.625 0.75 0.5 0.5 0.85714286 0.44444444 0.5 0.875 0.44444444] mean value: 0.6051587301587301 key: train_jcc value: [0.96666667 0.93548387 0.95081967 0.9047619 0.9047619 0.93333333 0.98305085 0.96610169 0.93333333 0.96610169] mean value: 0.9444414923244168 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', 
random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.31864738 0.31613731 0.37270212 0.30933118 0.3532021 0.42899418 0.34164643 0.27218747 0.27531123 0.27682853] mean value: 0.32649879455566405 key: score_time value: [0.03583121 0.02314615 0.02238393 0.02144551 0.02395582 0.02265859 0.02140379 0.02456188 0.02235389 0.02389455] mean value: 0.024163532257080077 key: test_mcc value: [0.41475753 0.54761905 0.73192505 0.38095238 0.38095238 0.85714286 0.23809524 0.21957752 0.85391256 0.23809524] mean value: 0.4863029808815056 key: train_mcc value: [0.96580947 0.98304594 0.94884541 0.89792372 0.89792372 0.93161894 1. 0.96580947 0.93161894 0.96580947] mean value: 0.9488405049573129 key: test_accuracy value: [0.69230769 0.76923077 0.84615385 0.69230769 0.69230769 0.92307692 0.61538462 0.61538462 0.92307692 0.61538462] mean value: 0.7384615384615385 key: train_accuracy value: [0.98290598 0.99145299 0.97435897 0.94871795 0.94871795 0.96581197 1. 0.98290598 0.96581197 0.98290598] mean value: 0.9743589743589743 key: test_fscore value: [0.71428571 0.76923077 0.85714286 0.66666667 0.66666667 0.92307692 0.61538462 0.66666667 0.93333333 0.61538462] mean value: 0.7427838827838827 key: train_fscore value: [0.98305085 0.99159664 0.97478992 0.95 0.95 0.96551724 1. 0.98275862 0.96551724 0.98275862] mean value: 0.9745989126217407 key: test_precision value: [0.625 0.71428571 0.75 0.66666667 0.66666667 1. 0.66666667 0.625 0.875 0.66666667] mean value: 0.7255952380952381 key: train_precision value: [0.98305085 0.98333333 0.96666667 0.93442623 0.93442623 0.96551724 1. 
0.98275862 0.96551724 0.98275862] mean value: 0.9698455030611952 key: test_recall value: [0.83333333 0.83333333 1. 0.66666667 0.66666667 0.85714286 0.57142857 0.71428571 1. 0.57142857] mean value: 0.7714285714285715 key: train_recall value: [0.98305085 1. 0.98305085 0.96610169 0.96610169 0.96551724 1. 0.98275862 0.96551724 0.98275862] mean value: 0.9794856808883694 key: test_roc_auc value: [0.70238095 0.77380952 0.85714286 0.69047619 0.69047619 0.92857143 0.61904762 0.60714286 0.91666667 0.61904762] mean value: 0.7404761904761905 key: train_roc_auc value: [0.98290473 0.99137931 0.97428404 0.94856809 0.94856809 0.96580947 1. 0.98290473 0.96580947 0.98290473] mean value: 0.9743132670952659 key: test_jcc value: [0.55555556 0.625 0.75 0.5 0.5 0.85714286 0.44444444 0.5 0.875 0.44444444] mean value: 0.6051587301587301 key: train_jcc value: [0.96666667 0.98333333 0.95081967 0.9047619 0.9047619 0.93333333 1. 0.96610169 0.93333333 0.96610169] mean value: 0.9509213538152133 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Logistic Regression Model func: LogisticRegression(random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]

Running model pipeline:
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)),
('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.03482747 0.04060721 0.06156802 0.03389192 0.033283 0.03031182 0.04303527 0.03335857 0.04897904 0.02938032]
mean value: 0.03892426490783692

key: score_time
value: [0.01205015 0.01392388 0.02358866 0.0120697 0.01431918 0.01200342 0.01424956 0.01427317 0.01218128 0.01201129]
mean value: 0.01406702995300293

key: test_mcc
value: [0.53935989 0.52295779 0.71562645 0.80909091 0.53935989 0.80909091 0.23636364 0.80909091 0.63305416 0.82572282]
mean value: 0.6439717368137134

key: train_mcc
value: [0.88405964 0.82054446 0.85264475 0.84171619 0.89500244 0.87301232 0.92597156 0.852022 0.89495572 0.90519967]
mean value: 0.8745128753545499

key: test_accuracy
value: [0.76190476 0.76190476 0.85714286 0.9047619 0.76190476 0.9047619 0.61904762 0.9047619 0.80952381 0.9047619 ]
mean value: 0.819047619047619

key: train_accuracy
value: [0.94179894 0.91005291 0.92592593 0.92063492 0.94708995 0.93650794 0.96296296 0.92592593 0.94708995 0.95238095]
mean value: 0.937037037037037

key: test_fscore
value: [0.70588235 0.73684211 0.84210526 0.9 0.70588235 0.90909091 0.63636364 0.90909091 0.8 0.9 ]
mean value: 0.8045257528848859

key: train_fscore
value: [0.94117647 0.90909091 0.92473118 0.9197861 0.94623656 0.93617021 0.96256684 0.92473118 0.94565217 0.95135135]
mean value: 0.936149298361715

key: test_precision
value: [0.85714286 0.77777778 0.88888889 0.9 0.85714286 0.90909091 0.63636364 0.90909091 0.88888889 1. ]
mean value: 0.8624386724386724

key: train_precision
value: [0.95652174 0.92391304 0.94505495 0.93478261 0.96703297 0.93617021 0.96774194 0.93478261 0.96666667 0.96703297]
mean value: 0.9499699694037375

key: test_recall
value: [0.6 0.7 0.8 0.9 0.6 0.90909091 0.63636364 0.90909091 0.72727273 0.81818182]
mean value: 0.76

key: train_recall
value: [0.92631579 0.89473684 0.90526316 0.90526316 0.92631579 0.93617021 0.95744681 0.91489362 0.92553191 0.93617021]
mean value: 0.9228107502799552

key: test_roc_auc
value: [0.75454545 0.75909091 0.85454545 0.90454545 0.75454545 0.90454545 0.61818182 0.90454545 0.81363636 0.90909091]
mean value: 0.8177272727272727

key: train_roc_auc
value: [0.9418813 0.91013438 0.92603583 0.92071669 0.94720045 0.93650616 0.96293393 0.92586786 0.94697648 0.95229563]
mean value: 0.9370548712206047

key: test_jcc
value: [0.54545455 0.58333333 0.72727273 0.81818182 0.54545455 0.83333333 0.46666667 0.83333333 0.66666667 0.81818182]
mean value: 0.6837878787878788

key: train_jcc
value: [0.88888889 0.83333333 0.86 0.85148515 0.89795918 0.88 0.92783505 0.86 0.89690722 0.90721649]
mean value: 0.8803625317297141

MCC on Blind test: 0.61
Accuracy on Blind test: 0.8

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models:
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. 
of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT. 
Increase the number of iterations (max_iter) or scale the data as shown in: https://scikit-learn.org/stable/modules/preprocessing.html Please also refer to the documentation for alternative solver options: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression n_iter_i = _check_optimize_result( Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LogisticRegressionCV(random_state=42))]) key: fit_time value: [0.94342947 1.26570821 0.92522335 1.06781173 0.90099669 0.92243218 0.97618771 0.7888 1.1742022 0.89259076] mean value: 0.9857382297515869 key: score_time value: [0.01842904 0.01664925 0.01643705 0.016366 0.01821828 0.02390528 0.01476598 0.01480126 0.01512694 0.01856089] mean value: 0.017325997352600098 key: test_mcc value: [0.74161985 0.82275335 0.74161985 0.82572282 0.80909091 0.71818182 0.23636364 1. 
0.90829511 0.67419986] mean value: 0.7477847204800199 key: train_mcc value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9989473684210526 key: test_accuracy value: [0.85714286 0.9047619 0.85714286 0.9047619 0.9047619 0.85714286 0.61904762 1. 0.95238095 0.80952381] mean value: 0.8666666666666667 key: train_accuracy value: [1. 0.99470899 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9994708994708995 key: test_fscore value: [0.82352941 0.88888889 0.82352941 0.90909091 0.9 0.85714286 0.63636364 1. 0.95652174 0.77777778] mean value: 0.8572844631923916 key: train_fscore value: [1. 0.99470899 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9994708994708995 key: test_precision value: [1. 1. 1. 0.83333333 0.9 0.9 0.63636364 1. 0.91666667 1. ] mean value: 0.9186363636363637 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.7 0.8 0.7 1. 0.9 0.81818182 0.63636364 1. 1. 0.63636364] mean value: 0.8190909090909091 key: train_recall value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9989473684210526 key: test_roc_auc value: [0.85 0.9 0.85 0.90909091 0.90454545 0.85909091 0.61818182 1. 0.95 0.81818182] mean value: 0.8659090909090909 key: train_roc_auc value: [1. 0.99473684 1. 1. 1. 1. 1. 1. 1. 1. ] mean value: 0.9994736842105263 key: test_jcc value: [0.7 0.8 0.7 0.83333333 0.81818182 0.75 0.46666667 1. 0.91666667 0.63636364] mean value: 0.7621212121212121 key: train_jcc value: [1. 0.98947368 1. 1. 1. 1. 1. 1. 1. 1. 
] mean value: 0.9989473684210526 MCC on Blind test: 0.6 Accuracy on Blind test: 0.8 Model_name: Gaussian NB Model func: GaussianNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianNB())]) key: fit_time value: [0.02358484 0.00938153 0.00928259 0.00879526 0.00884533 0.00888205 0.00890446 0.00898314 0.01293993 0.01311731] mean value: 0.01127164363861084 key: score_time value: [0.01048207 0.00914145 0.00899744 0.00867033 0.00864816 0.00866389 0.00865364 0.00863838 0.01307774 0.01077056] mean value: 0.009574365615844727 key: test_mcc value: [0.35527986 0.23636364 0.60302269 0.35527986 0.21968621 0.58630197 0.13858047 0.42727273 0.45226702 0.24120908] mean value: 0.3615263506116118 key: train_mcc value: [0.43194158 0.37741808 0.40940239 0.42563559 0.39658396 0.43701355 0.42107287 0.48299607 0.34600551 0.40559385] mean value: 0.41336634360677543 key: test_accuracy value: [0.66666667 0.61904762 0.76190476 0.66666667 0.57142857 0.76190476 0.57142857 0.71428571 0.71428571 0.61904762] mean value: 0.6666666666666666 key: train_accuracy value: [0.7037037 0.66666667 0.69312169 0.7037037 0.68783069 0.70899471 0.6984127 0.74074074 0.62962963 0.68783069] mean value: 0.692063492063492 key: test_fscore value: [0.69565217 0.6 0.8 0.69565217 0.66666667 0.81481481 0.66666667 0.72727273 0.76923077 0.69230769] mean value: 
0.7128263684785424 key: train_fscore value: [0.74774775 0.73191489 0.73873874 0.74311927 0.73303167 0.74418605 0.73972603 0.74871795 0.72 0.73542601] mean value: 0.7382608351962145 key: test_precision value: [0.61538462 0.6 0.66666667 0.61538462 0.52941176 0.6875 0.5625 0.72727273 0.66666667 0.6 ] mean value: 0.6270787056081174 key: train_precision value: [0.65354331 0.61428571 0.64566929 0.65853659 0.64285714 0.66115702 0.648 0.72277228 0.57692308 0.63565891] mean value: 0.6459403334606778 key: test_recall value: [0.8 0.6 1. 0.8 0.9 1. 0.81818182 0.72727273 0.90909091 0.81818182] mean value: 0.8372727272727273 key: train_recall value: [0.87368421 0.90526316 0.86315789 0.85263158 0.85263158 0.85106383 0.86170213 0.77659574 0.95744681 0.87234043] mean value: 0.8666517357222845 key: test_roc_auc value: [0.67272727 0.61818182 0.77272727 0.67272727 0.58636364 0.75 0.55909091 0.71363636 0.70454545 0.60909091] mean value: 0.6659090909090909 key: train_roc_auc value: [0.70279955 0.66539754 0.69221725 0.70291153 0.68695409 0.70974244 0.69927212 0.74092945 0.63135498 0.68880179] mean value: 0.6920380739081747 key: test_jcc value: [0.53333333 0.42857143 0.66666667 0.53333333 0.5 0.6875 0.5 0.57142857 0.625 0.52941176] mean value: 0.5575245098039215 key: train_jcc value: [0.5971223 0.57718121 0.58571429 0.59124088 0.57857143 0.59259259 0.58695652 0.59836066 0.5625 0.58156028] mean value: 0.5851800154167459 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Naive Bayes Model func: BernoulliNB() Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.01514578 0.0143559 0.01509309 0.01282573 0.00945067 0.00921535 0.00978017 0.00969481 0.00936055 0.00989962] mean value: 0.011482167243957519 key: score_time value: [0.01314116 0.01436496 0.014256 0.01244879 0.00897789 0.00877929 0.00870728 0.00949979 0.00937581 0.00883293] mean value: 0.01083838939666748 key: test_mcc value: [ 0.23636364 -0.06741999 0.61818182 0.53935989 -0.04545455 0.24771685 0.03739788 0.33636364 0.55161872 0.24771685] mean value: 0.2701844747191005 key: train_mcc value: [0.44056694 0.44988241 0.45158615 0.40571724 0.47104939 0.40741573 0.48340519 0.41808615 0.40741573 0.50382186] mean value: 0.4438946787181437 key: test_accuracy value: [0.61904762 0.47619048 0.80952381 0.76190476 0.47619048 0.61904762 0.52380952 0.66666667 0.76190476 0.61904762] mean value: 0.6333333333333333 key: train_accuracy value: [0.71957672 0.72486772 0.72486772 0.6984127 0.73544974 0.7037037 0.74074074 0.70899471 0.7037037 0.75132275] mean value: 0.7211640211640211 key: test_fscore value: [0.6 0.35294118 0.8 0.70588235 0.47619048 0.6 0.58333333 0.66666667 0.73684211 0.6 ] mean value: 0.6121856110865399 key: train_fscore value: [0.71038251 0.72340426 0.71428571 0.66666667 0.73404255 0.69892473 0.72625698 0.7027027 0.69892473 0.74033149] mean value: 0.7115922343145447 key: test_precision value: [0.6 0.42857143 0.8 0.85714286 0.45454545 0.66666667 0.53846154 0.7 0.875 0.66666667] mean value: 0.6587054612054611 key: train_precision value: [0.73863636 0.7311828 0.74712644 0.75 0.74193548 0.70652174 0.76470588 0.71428571 0.70652174 0.77011494] mean value: 0.7371031097416126 key: 
test_recall value: [0.6 0.3 0.8 0.6 0.5 0.54545455 0.63636364 0.63636364 0.63636364 0.54545455] mean value: 0.58 key: train_recall value: [0.68421053 0.71578947 0.68421053 0.6 0.72631579 0.69148936 0.69148936 0.69148936 0.69148936 0.71276596] mean value: 0.6889249720044793 key: test_roc_auc value: [0.61818182 0.46818182 0.80909091 0.75454545 0.47727273 0.62272727 0.51818182 0.66818182 0.76818182 0.62272727] mean value: 0.6327272727272727 key: train_roc_auc value: [0.71976484 0.72491601 0.72508399 0.69893617 0.73549832 0.70363942 0.74048152 0.70890258 0.70363942 0.75111982] mean value: 0.7211982082866741 key: test_jcc value: [0.42857143 0.21428571 0.66666667 0.54545455 0.3125 0.42857143 0.41176471 0.5 0.58333333 0.42857143] mean value: 0.45197192513368983 key: train_jcc value: [0.55084746 0.56666667 0.55555556 0.5 0.57983193 0.53719008 0.57017544 0.54166667 0.53719008 0.5877193 ] mean value: 0.5526843181420478 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: K-Nearest Neighbors Model func: KNeighborsClassifier() Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', KNeighborsClassifier())]) key: fit_time value: [0.01027989 0.01051307 0.00920153 0.00923085 0.00902748 0.00923491 0.01056838 0.01076412 0.01129293 0.01068258] mean value: 0.010079574584960938 key: score_time value: [0.01671553 0.01708245 0.01061392 0.01051021 0.01748228 0.01556611 0.01171184 0.01431727 0.01263571 0.01222897] mean value: 0.013886427879333496 key: test_mcc value: [-0.14545455 0.13483997 0.23373675 0.14545455 -0.39196475 -0.24771685 -0.33709993 0.33636364 0.15894099 0.05504819] mean value: -0.0057851993419990025 key: train_mcc value: [0.45024663 0.42906778 0.42923006 0.46035834 0.43243527 0.49287375 0.50280155 0.43919373 0.48199732 0.43065616] mean value: 0.45488605817766276 key: test_accuracy value: [0.42857143 0.57142857 0.61904762 0.57142857 0.33333333 0.38095238 0.33333333 0.66666667 0.57142857 0.52380952] mean value: 0.5 key: train_accuracy value: [0.72486772 0.71428571 0.71428571 0.73015873 0.71428571 0.74603175 0.75132275 0.71957672 0.74074074 0.71428571] mean value: 0.726984126984127 key: test_fscore value: [0.4 0.47058824 0.55555556 0.57142857 0.125 0.43478261 0.22222222 0.66666667 0.52631579 0.5 ] mean value: 0.447255964933647 key: train_fscore value: [0.72043011 0.70967742 0.7244898 0.73015873 0.69662921 0.73626374 0.74594595 0.71957672 0.73224044 0.69662921] mean value: 0.7212041318869982 key: test_precision value: [0.4 0.57142857 0.625 0.54545455 0.16666667 0.41666667 0.28571429 0.7 0.625 0.55555556] mean value: 0.48914862914862917 key: train_precision value: [0.73626374 0.72527473 0.7029703 0.73404255 0.74698795 0.76136364 0.75824176 0.71578947 0.75280899 0.73809524] mean value: 
0.7371838358715771 key: test_recall value: [0.4 0.4 0.5 0.6 0.1 0.45454545 0.18181818 0.63636364 0.45454545 0.45454545] mean value: 0.41818181818181815 key: train_recall value: [0.70526316 0.69473684 0.74736842 0.72631579 0.65263158 0.71276596 0.73404255 0.72340426 0.71276596 0.65957447] mean value: 0.7068868980963046 key: test_roc_auc value: [0.42727273 0.56363636 0.61363636 0.57272727 0.32272727 0.37727273 0.34090909 0.66818182 0.57727273 0.52727273] mean value: 0.49909090909090903 key: train_roc_auc value: [0.724972 0.7143897 0.71410974 0.73017917 0.71461366 0.74585666 0.7512318 0.71959686 0.74059351 0.71399776] mean value: 0.7269540873460246 key: test_jcc value: [0.25 0.30769231 0.38461538 0.4 0.06666667 0.27777778 0.125 0.5 0.35714286 0.33333333] mean value: 0.30022283272283273 key: train_jcc value: [0.56302521 0.55 0.568 0.575 0.53448276 0.5826087 0.59482759 0.56198347 0.57758621 0.53448276] mean value: 0.5641996687155415 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: SVM Model func: SVC(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, 
importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline:/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. 
warnings.warn( Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SVC(random_state=42))]) key: fit_time value: [0.01432991 0.01354694 0.01185727 0.01176095 0.01348209 0.01326251 0.01258087 0.01316094 0.01310778 0.01367092] mean value: 0.013076019287109376 key: score_time value: [0.01142573 0.01127124 0.00964952 0.0101819 0.01053739 0.01043487 0.01040888 0.01022005 0.01001334 0.01339149] mean value: 0.010753440856933593 key: test_mcc value: [ 0.42727273 -0.06741999 0.80909091 0.06741999 0.13762047 0.42727273 0.04545455 0.52295779 0.71818182 0.61818182] mean value: 0.37060328045303614 key: train_mcc value: [0.69399986 0.73585755 0.80972114 0.74663724 0.75702928 0.78850682 0.8636019 0.77999992 0.79896965 0.79930542] mean value: 0.7773628790574287 key: test_accuracy value: [0.71428571 0.47619048 0.9047619 0.52380952 0.57142857 0.71428571 0.52380952 0.76190476 0.85714286 0.80952381] mean value: 0.6857142857142857 key: train_accuracy value: [0.84656085 0.86772487 0.9047619 0.87301587 0.87830688 0.89417989 0.93121693 0.88888889 0.8994709 0.8994709 ] mean value: 0.8883597883597883 key: test_fscore value: [0.7 0.35294118 0.9 0.58333333 0.52631579 0.72727273 0.54545455 0.7826087 0.85714286 0.81818182] mean value: 0.6793250942981728 key: train_fscore value: [0.85128205 0.86631016 0.90425532 0.87628866 0.87700535 0.89247312 0.92896175 0.89230769 0.89839572 0.8972973 ] mean value: 0.8884577116689765 key: test_precision value: [0.7 0.42857143 0.9 0.5 0.55555556 0.72727273 0.54545455 0.75 0.9 0.81818182] mean value: 0.6825036075036075 key: train_precision value: [0.83 0.88043478 0.91397849 0.85858586 0.89130435 0.90217391 0.95505618 0.86138614 0.90322581 0.91208791] mean value: 
0.8908233433616443 key: test_recall value: [0.7 0.3 0.9 0.7 0.5 0.72727273 0.54545455 0.81818182 0.81818182 0.81818182] mean value: 0.6827272727272727 key: train_recall value: [0.87368421 0.85263158 0.89473684 0.89473684 0.86315789 0.88297872 0.90425532 0.92553191 0.89361702 0.88297872] mean value: 0.8868309070548712 key: test_roc_auc value: [0.71363636 0.46818182 0.90454545 0.53181818 0.56818182 0.71363636 0.52272727 0.75909091 0.85909091 0.80909091] mean value: 0.6849999999999999 key: train_roc_auc value: [0.84641657 0.86780515 0.90481523 0.87290034 0.87838746 0.89412094 0.93107503 0.88908175 0.89944009 0.8993841 ] mean value: 0.8883426651735722 key: test_jcc value: [0.53846154 0.21428571 0.81818182 0.41176471 0.35714286 0.57142857 0.375 0.64285714 0.75 0.69230769] mean value: 0.5371430040547688 key: train_jcc value: [0.74107143 0.76415094 0.82524272 0.77981651 0.78095238 0.80582524 0.86734694 0.80555556 0.81553398 0.81372549] mean value: 0.7999221192956221 MCC on Blind test: 0.43 Accuracy on Blind test: 0.73 Model_name: MLP Model func: MLPClassifier(max_iter=500, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, 
enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MLPClassifier(max_iter=500, random_state=42))]) key: fit_time value: [1.33641791 1.50435567 1.61624503 1.38623667 1.66725993 1.31514406 1.19694066 1.21551633 1.49928236 1.54568815] mean value: 1.4283086776733398 key: score_time value: [0.01264334 0.01469707 0.02935648 0.02213526 0.01919293 0.01318979 0.01264215 0.01259017 0.01352763 0.01921105] mean value: 0.016918587684631347 key: test_mcc value: [0.43007562 0.53935989 0.90829511 0.80909091 0.71818182 0.4719399 0.33028913 0.80909091 0.63305416 0.67419986] mean value: 0.6323577308640445 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.71428571 0.76190476 0.95238095 0.9047619 0.85714286 0.71428571 0.66666667 0.9047619 0.80952381 0.80952381] mean value: 0.8095238095238095 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.70588235 0.94736842 0.9 0.85714286 0.66666667 0.69565217 0.90909091 0.8 0.77777778] mean value: 0.7926247825251729 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 0.85714286 1. 0.9 0.81818182 0.85714286 0.66666667 0.90909091 0.88888889 1. ] mean value: 0.8647113997113997 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_recall value: [0.6 0.6 0.9 0.9 0.9 0.54545455 0.72727273 0.90909091 0.72727273 0.63636364] mean value: 0.7445454545454545 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.70909091 0.75454545 0.95 0.90454545 0.85909091 0.72272727 0.66363636 0.90454545 0.81363636 0.81818182] mean value: 0.8099999999999999 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.54545455 0.9 0.81818182 0.75 0.5 0.53333333 0.83333333 0.66666667 0.63636364] mean value: 0.6683333333333333 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Decision Tree Model func: DecisionTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, 
random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', DecisionTreeClassifier(random_state=42))]) key: fit_time value: [0.01775002 0.01362348 0.01367092 0.01293755 0.01265097 0.01258326 0.01258826 0.01409173 0.01307774 0.01271462] mean value: 0.013568854331970215 key: score_time value: [0.01175952 0.00916195 0.00893497 0.00862074 0.00865817 0.00863695 0.0087688 0.00872326 0.00884724 0.00894094] mean value: 0.009105253219604491 key: test_mcc value: [0.82275335 0.82275335 1. 0.90829511 1. 1. 
0.42727273 0.90909091 0.82572282 0.80909091] mean value: 0.8524979177943448 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.9047619 1. 0.95238095 1. 1. 0.71428571 0.95238095 0.9047619 0.9047619 ] mean value: 0.9238095238095239 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.88888889 1. 0.94736842 1. 1. 0.72727273 0.95238095 0.9 0.90909091] mean value: 0.9213890787574999 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 1. 1. 1. 0.72727273 1. 1. 0.90909091] mean value: 0.9636363636363636 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.8 1. 0.9 1. 1. 0.72727273 0.90909091 0.81818182 0.90909091] mean value: 0.8863636363636364 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.9 1. 0.95 1. 1. 0.71363636 0.95454545 0.90909091 0.90454545] mean value: 0.9231818181818182 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.8 1. 0.9 1. 1. 0.57142857 0.90909091 0.81818182 0.83333333] mean value: 0.8632034632034632 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.49 Accuracy on Blind test: 0.73 Model_name: Extra Trees Model func: ExtraTreesClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', 
GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreesClassifier(random_state=42))]) key: fit_time value: [0.09987569 0.10673809 0.11479974 0.10427046 0.10179067 0.12284112 0.14259434 0.10524893 0.10025811 0.09575725] mean value: 0.10941743850708008 key: score_time value: [0.02278996 0.02107644 0.02003169 0.01899815 0.01821494 0.0267005 0.02168226 0.01780367 0.01765037 0.01754332] mean value: 0.020249128341674805 key: test_mcc value: [0.82275335 0.71562645 0.90829511 0.61818182 0.82572282 0.52727273 0.23636364 0.80909091 0.71818182 0.90909091] mean value: 0.7090579546795412 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.85714286 0.95238095 0.80952381 0.9047619 0.76190476 0.61904762 0.9047619 0.85714286 0.95238095] mean value: 0.8523809523809524 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.84210526 0.94736842 0.8 0.90909091 0.76190476 0.63636364 0.90909091 0.85714286 0.95238095] mean value: 0.8504336599073441 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.88888889 1. 
0.8 0.83333333 0.8 0.63636364 0.90909091 0.9 1. ] mean value: 0.8767676767676768 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.8 0.9 0.8 1. 0.72727273 0.63636364 0.90909091 0.81818182 0.90909091] mean value: 0.8300000000000001 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.85454545 0.95 0.80909091 0.90909091 0.76363636 0.61818182 0.90454545 0.85909091 0.95454545] mean value: 0.8522727272727273 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.72727273 0.9 0.66666667 0.83333333 0.61538462 0.46666667 0.83333333 0.75 0.90909091] mean value: 0.7501748251748251 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.29 Accuracy on Blind test: 0.67 Model_name: Extra Tree Model func: ExtraTreeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, 
min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', ExtraTreeClassifier(random_state=42))]) key: fit_time value: [0.01057601 0.01102757 0.01102114 0.01060414 0.0107491 0.01077795 0.01005793 0.01006055 0.01025558 0.00970602] mean value: 0.010483598709106446 key: score_time value: [0.01038694 0.01018834 0.00997496 0.00985217 0.00996852 0.00999904 0.00966644 0.00945711 0.00876474 0.0086937 ] mean value: 0.009695196151733398 key: test_mcc value: [0.43007562 0.58630197 0.80909091 0.44038551 0.52295779 0.35527986 0.13762047 0.26967994 0.52727273 0.67419986] mean value: 0.47528646505235683 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.71428571 0.76190476 0.9047619 0.71428571 0.76190476 0.66666667 0.57142857 0.61904762 0.76190476 0.80952381] mean value: 0.7285714285714285 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.66666667 0.66666667 0.9 0.72727273 0.73684211 0.63157895 0.60869565 0.55555556 0.76190476 0.77777778] mean value: 0.7032960860649647 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.75 1. 0.9 0.66666667 0.77777778 0.75 0.58333333 0.71428571 0.8 1. ] mean value: 0.7942063492063492 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.6 0.5 0.9 0.8 0.7 0.54545455 0.63636364 0.45454545 0.72727273 0.63636364] mean value: 0.65 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_roc_auc value: [0.70909091 0.75 0.90454545 0.71818182 0.75909091 0.67272727 0.56818182 0.62727273 0.76363636 0.81818182] mean value: 0.7290909090909091 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.5 0.5 0.81818182 0.57142857 0.58333333 0.46153846 0.4375 0.38461538 0.61538462 0.63636364] mean value: 0.5508345820845821 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest Model func: RandomForestClassifier(n_estimators=1000, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', 
LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(n_estimators=1000, random_state=42))]) key: fit_time value: [1.32293153 1.42478061 1.48231292 1.47654915 1.48469901 1.48599815 1.46488094 1.48408747 1.48369265 1.48373985] mean value: 1.4593672275543212 key: score_time value: [0.10207772 0.10522914 0.10460138 0.10517383 0.10500145 0.10545468 0.10445786 0.10404372 0.1047914 0.10368395] mean value: 0.10445151329040528 key: test_mcc value: [0.66332496 0.53935989 0.90829511 0.71562645 0.90909091 0.90829511 0.33028913 0.90909091 0.90829511 0.90909091] mean value: 0.7700758470872185 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.80952381 0.76190476 0.95238095 0.85714286 0.95238095 0.95238095 0.66666667 0.95238095 0.95238095 0.95238095] mean value: 0.8809523809523809 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.75 0.70588235 0.94736842 0.84210526 0.95238095 0.95652174 0.69565217 0.95238095 0.95652174 0.95238095] mean value: 0.8711194546468473 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 0.85714286 1. 0.88888889 0.90909091 0.91666667 0.66666667 1. 0.91666667 1. ] mean value: 0.9155122655122655 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.6 0.6 0.9 0.8 1. 1. 0.72727273 0.90909091 1. 0.90909091] mean value: 0.8445454545454545 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.8 0.75454545 0.95 0.85454545 0.95454545 0.95 0.66363636 0.95454545 0.95 0.95454545] mean value: 0.8786363636363637 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. warn( /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers. 
warn( [0.6 0.54545455 0.9 0.72727273 0.90909091 0.91666667 0.53333333 0.90909091 0.91666667 0.90909091] mean value: 0.7866666666666666 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: Random Forest2 Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', 
GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z...05', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [1.00507212 1.5582056 1.35330391 0.88756967 0.90433073 1.32030916 1.99715137 0.90522766 0.9060688 0.88190556] mean value: 1.1719144582748413 key: score_time value: [0.14654708 0.13812089 0.14643788 0.17348289 0.18689179 0.25120091 0.13585401 0.22276163 0.19270754 0.18566346] mean value: 0.17796680927276612 key: test_mcc value: [0.74161985 0.52295779 0.82275335 0.53935989 0.71818182 0.90829511 0.13762047 0.82572282 0.71818182 0.71818182] mean value: 0.6652874733582891 key: train_mcc value: [0.95767077 0.95788064 0.96830553 0.95767077 0.95788064 0.96830907 0.98947368 0.96830907 0.96830553 0.95789003] mean value: 0.9651695734907783 key: test_accuracy value: [0.85714286 0.76190476 0.9047619 0.76190476 0.85714286 0.95238095 0.57142857 0.9047619 0.85714286 0.85714286] mean value: 0.8285714285714285 key: train_accuracy value: [0.97883598 0.97883598 0.98412698 0.97883598 0.97883598 0.98412698 0.99470899 0.98412698 0.98412698 0.97883598] mean 
value: 0.9825396825396825 key: test_fscore value: [0.82352941 0.73684211 0.88888889 0.70588235 0.85714286 0.95652174 0.60869565 0.9 0.85714286 0.85714286] mean value: 0.8191788721590848 key: train_fscore value: [0.97894737 0.97916667 0.98429319 0.97894737 0.97916667 0.98412698 0.99470899 0.98412698 0.98395722 0.97894737] mean value: 0.9826388814528069 key: test_precision value: [1. 0.77777778 1. 0.85714286 0.81818182 0.91666667 0.58333333 1. 0.9 0.9 ] mean value: 0.8753102453102453 key: train_precision value: [0.97894737 0.96907216 0.97916667 0.97894737 0.96907216 0.97894737 0.98947368 0.97894737 0.98924731 0.96875 ] mean value: 0.9780571466286268 key: test_recall value: [0.7 0.7 0.8 0.6 0.9 1. 0.63636364 0.81818182 0.81818182 0.81818182] mean value: 0.7790909090909091 key: train_recall value: [0.97894737 0.98947368 0.98947368 0.97894737 0.98947368 0.9893617 1. 0.9893617 0.9787234 0.9893617 ] mean value: 0.9873124300111982 key: test_roc_auc value: [0.85 0.75909091 0.9 0.75454545 0.85909091 0.95 0.56818182 0.90909091 0.85909091 0.85909091] mean value: 0.8268181818181818 key: train_roc_auc value: [0.97883539 0.9787794 0.98409854 0.97883539 0.9787794 0.98415454 0.99473684 0.98415454 0.98409854 0.97889138] mean value: 0.9825363941769317 key: test_jcc value: [0.7 0.58333333 0.8 0.54545455 0.75 0.91666667 0.4375 0.81818182 0.75 0.75 ] mean value: 0.7051136363636363 key: train_jcc value: [0.95876289 0.95918367 0.96907216 0.95876289 0.95918367 0.96875 0.98947368 0.96875 0.96842105 0.95876289] mean value: 0.9659122908523149 MCC on Blind test: 0.58 Accuracy on Blind test: 0.8 Model_name: Naive Bayes Model func: BernoulliNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision 
Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BernoulliNB())]) key: fit_time value: [0.02238894 0.00916934 0.00952101 0.01010132 0.01035404 0.01048112 0.01021218 0.01014829 0.00954199 0.01029658] mean value: 0.011221480369567872 key: score_time value: [0.00930572 0.0087676 0.00974298 0.00951648 0.00947809 0.00952435 0.00950789 0.00956464 0.00935626 0.00951338] mean value: 0.009427738189697266 key: test_mcc value: [ 0.23636364 -0.06741999 0.61818182 0.53935989 -0.04545455 0.24771685 0.03739788 0.33636364 0.55161872 0.24771685] mean value: 0.2701844747191005 key: train_mcc value: [0.44056694 0.44988241 0.45158615 0.40571724 0.47104939 0.40741573 0.48340519 0.41808615 0.40741573 0.50382186] mean value: 0.4438946787181437 key: test_accuracy value: [0.61904762 0.47619048 0.80952381 0.76190476 0.47619048 0.61904762 0.52380952 0.66666667 0.76190476 0.61904762] mean value: 0.6333333333333333 key: train_accuracy value: [0.71957672 0.72486772 0.72486772 0.6984127 0.73544974 0.7037037 0.74074074 0.70899471 0.7037037 0.75132275] mean value: 0.7211640211640211 key: test_fscore value: [0.6 0.35294118 0.8 0.70588235 0.47619048 0.6 0.58333333 0.66666667 0.73684211 0.6 ] mean value: 0.6121856110865399 key: train_fscore value: [0.71038251 0.72340426 0.71428571 0.66666667 0.73404255 0.69892473 0.72625698 0.7027027 0.69892473 0.74033149] mean value: 0.7115922343145447 key: test_precision value: [0.6 0.42857143 0.8 0.85714286 0.45454545 0.66666667 0.53846154 0.7 0.875 0.66666667] mean value: 0.6587054612054611 key: train_precision value: [0.73863636 0.7311828 0.74712644 0.75 0.74193548 0.70652174 0.76470588 0.71428571 0.70652174 0.77011494] mean value: 0.7371031097416126 key: 
test_recall value: [0.6 0.3 0.8 0.6 0.5 0.54545455 0.63636364 0.63636364 0.63636364 0.54545455] mean value: 0.58 key: train_recall value: [0.68421053 0.71578947 0.68421053 0.6 0.72631579 0.69148936 0.69148936 0.69148936 0.69148936 0.71276596] mean value: 0.6889249720044793 key: test_roc_auc value: [0.61818182 0.46818182 0.80909091 0.75454545 0.47727273 0.62272727 0.51818182 0.66818182 0.76818182 0.62272727] mean value: 0.6327272727272727 key: train_roc_auc value: [0.71976484 0.72491601 0.72508399 0.69893617 0.73549832 0.70363942 0.74048152 0.70890258 0.70363942 0.75111982] mean value: 0.7211982082866741 key: test_jcc value: [0.42857143 0.21428571 0.66666667 0.54545455 0.3125 0.42857143 0.41176471 0.5 0.58333333 0.42857143] mean value: 0.45197192513368983 key: train_jcc value: [0.55084746 0.56666667 0.55555556 0.5 0.57983193 0.53719008 0.57017544 0.54166667 0.53719008 0.5877193 ] mean value: 0.5526843181420478 MCC on Blind test: 0.44 Accuracy on Blind test: 0.73 Model_name: XGBoost Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra 
Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, enable_categorical=False, gamma=None, gpu_id=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'Z... interaction_constraints=None, learning_rate=None, max_delta_step=None, max_depth=None, min_child_weight=None, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=42, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, subsample=None, tree_method=None, use_label_encoder=False, validate_parameters=None, verbosity=0))]) key: fit_time value: [2.61034989 5.89738679 5.05655122 4.04000521 4.81949782 4.60966992 4.59260941 4.32805228 4.22248673 2.80192494] mean value: 4.297853422164917 key: score_time value: [0.02688909 0.02042317 0.02070355 0.01861191 0.02235937 0.0204308 0.02161312 0.02076817 0.02501345 0.0135026 ] mean value: 0.021031522750854494 key: test_mcc value: [0.82275335 0.82275335 0.82275335 1. 1. 0.90829511 0.62641448 1. 1. 0.80909091] mean value: 0.881206055224815 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.9047619 0.9047619 1. 1. 0.95238095 0.80952381 1. 1. 0.9047619 ] mean value: 0.9380952380952381 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.88888889 0.88888889 1. 1. 0.95652174 0.83333333 1. 1. 0.90909091] mean value: 0.9365612648221344 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 1. 1. 0.91666667 0.76923077 1. 1. 0.90909091] mean value: 0.9594988344988344 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.8 0.8 1. 1. 1. 0.90909091 1. 1. 0.90909091] mean value: 0.9218181818181819 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.9 0.9 1. 1. 0.95 0.80454545 1. 1. 0.90454545] mean value: 0.9359090909090909 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_jcc value: [0.8 0.8 0.8 1. 1. 0.91666667 0.71428571 1. 1. 0.83333333] mean value: 0.8864285714285715 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: LDA Model func: LinearDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, 
random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', LinearDiscriminantAnalysis())]) key: fit_time value: [0.06827331 0.07484531 0.05876684 0.03986669 0.06439209 0.07714224 0.06403208 0.0616889 0.03780341 0.04415965] mean value: 0.0590970516204834 key: score_time value: [0.02553988 0.02265811 0.02189136 0.01328444 0.01226473 0.01283622 0.02830124 0.02277637 0.01287508 0.0122354 ] mean value: 0.01846628189086914 key: test_mcc value: [0.66332496 0.45226702 0.43007562 0.63305416 0.44038551 0.71818182 0.60302269 0.71562645 0.71818182 0.67419986] mean value: 0.6048319896654357 key: train_mcc value: [0.98947251 0.93650616 0.97883539 0.98947368 0.97883539 0.95767077 0.94755736 0.93650616 0.94755736 0.96873621] mean value: 0.9631150997764043 key: test_accuracy value: [0.80952381 0.71428571 0.71428571 0.80952381 0.71428571 0.85714286 0.76190476 0.85714286 0.85714286 0.80952381] mean value: 0.7904761904761904 key: train_accuracy value: [0.99470899 0.96825397 0.98941799 0.99470899 0.98941799 0.97883598 0.97354497 0.96825397 0.97354497 0.98412698] mean value: 0.9814814814814814 key: 
test_fscore value: [0.75 0.625 0.66666667 0.81818182 0.72727273 0.85714286 0.70588235 0.86956522 0.85714286 0.77777778] mean value: 0.7654632274517185 key: train_fscore value: [0.9947644 0.96842105 0.98947368 0.99470899 0.98947368 0.9787234 0.97297297 0.96808511 0.97297297 0.98378378] mean value: 0.9813380054035413 key: test_precision value: [1. 0.83333333 0.75 0.75 0.66666667 0.9 1. 0.83333333 0.9 1. ] mean value: 0.8633333333333333 key: train_precision value: [0.98958333 0.96842105 0.98947368 1. 0.98947368 0.9787234 0.98901099 0.96808511 0.98901099 1. ] mean value: 0.9861782243046241 key: test_recall value: [0.6 0.5 0.6 0.9 0.8 0.81818182 0.54545455 0.90909091 0.81818182 0.63636364] mean value: 0.7127272727272728 key: train_recall value: [1. 0.96842105 0.98947368 0.98947368 0.98947368 0.9787234 0.95744681 0.96808511 0.95744681 0.96808511] mean value: 0.9766629339305711 key: test_roc_auc value: [0.8 0.70454545 0.70909091 0.81363636 0.71818182 0.85909091 0.77272727 0.85454545 0.85909091 0.81818182] mean value: 0.7909090909090909 key: train_roc_auc value: [0.99468085 0.96825308 0.98941769 0.99473684 0.98941769 0.97883539 0.97346025 0.96825308 0.97346025 0.98404255] mean value: 0.9814557670772677 key: test_jcc value: [0.6 0.45454545 0.5 0.69230769 0.57142857 0.75 0.54545455 0.76923077 0.75 0.63636364] mean value: 0.6269330669330669 key: train_jcc value: [0.98958333 0.93877551 0.97916667 0.98947368 0.97916667 0.95833333 0.94736842 0.93814433 0.94736842 0.96808511] mean value: 0.9635465472799757 MCC on Blind test: 0.11 Accuracy on Blind test: 0.53 Model_name: Multinomial Model func: MultinomialNB() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', 
DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', MultinomialNB())]) key: fit_time value: [0.02180147 0.01045227 0.01006985 0.0100987 0.00979686 0.00981426 0.00987458 0.01014495 0.01003695 0.01014376] mean value: 0.011223363876342773 key: score_time value: [0.00946331 0.00991654 0.00950646 0.00958514 0.00926518 0.00935149 0.00945377 0.00949931 0.00967503 0.00960541] mean value: 0.00953216552734375 key: test_mcc value: [0.14545455 0.13762047 0.63305416 0.33028913 0.30914104 0.45226702 0.13762047 0.61818182 0.71818182 0.23373675] mean value: 0.3715547220669732 key: train_mcc value: [0.41147388 0.40281841 0.38806379 0.41147388 0.39005594 0.40105488 0.40396007 0.42240682 0.43243527 0.41239882] mean value: 0.40761417768557445 key: test_accuracy value: [0.57142857 0.57142857 0.80952381 0.66666667 0.61904762 0.71428571 0.57142857 0.80952381 0.85714286 0.61904762] mean value: 0.680952380952381 key: train_accuracy value: [0.7037037 0.6984127 0.69312169 0.7037037 0.69312169 0.6984127 0.6984127 0.70899471 0.71428571 0.7037037 ] mean value: 0.7015873015873015 key: test_fscore value: [0.57142857 0.52631579 0.81818182 0.63157895 0.69230769 0.76923077 0.60869565 0.81818182 0.85714286 0.66666667] mean value: 0.6959730582156212 key: train_fscore value: [0.7254902 0.72463768 0.71 0.7254902 0.71568627 0.71641791 0.72195122 0.72636816 0.73 0.72277228] mean value: 0.7218813914217747 key: test_precision value: [0.54545455 0.55555556 0.75 0.66666667 0.5625 0.66666667 0.58333333 0.81818182 0.9 0.61538462] mean value: 0.6663743201243202 key: train_precision value: [0.67889908 0.66964286 0.67619048 0.67889908 0.66972477 0.6728972 0.66666667 0.68224299 0.68867925 0.67592593] mean 
value: 0.6759768293904649 key: test_recall value: [0.6 0.5 0.9 0.6 0.9 0.90909091 0.63636364 0.81818182 0.81818182 0.72727273] mean value: 0.740909090909091 key: train_recall value: [0.77894737 0.78947368 0.74736842 0.77894737 0.76842105 0.76595745 0.78723404 0.77659574 0.77659574 0.77659574] mean value: 0.7746136618141097 key: test_roc_auc value: [0.57272727 0.56818182 0.81363636 0.66363636 0.63181818 0.70454545 0.56818182 0.80909091 0.85909091 0.61363636] mean value: 0.6804545454545454 key: train_roc_auc value: [0.70330347 0.69792833 0.69283315 0.70330347 0.69272116 0.6987682 0.69888018 0.7093505 0.71461366 0.70408735] mean value: 0.7015789473684211 key: test_jcc value: [0.4 0.35714286 0.69230769 0.46153846 0.52941176 0.625 0.4375 0.69230769 0.75 0.5 ] mean value: 0.5445208468002586 key: train_jcc value: [0.56923077 0.56818182 0.5503876 0.56923077 0.55725191 0.55813953 0.5648855 0.5703125 0.57480315 0.56589147] mean value: 0.5648315015480971 MCC on Blind test: 0.12 Accuracy on Blind test: 0.6 Model_name: Passive Aggresive Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, 
colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', PassiveAggressiveClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01307464 0.01691055 0.01696324 0.01449084 0.018785 0.01570868 0.01574159 0.01785326 0.01694202 0.01764059] mean value: 0.01641104221343994 key: score_time value: [0.0097959 0.0115664 0.0114677 0.01176572 0.01179075 0.01177979 0.01176739 0.01171255 0.01223779 0.01183796] mean value: 0.01157219409942627 key: test_mcc value: [0.43007562 0.46249729 0.71562645 0.80909091 0.36244122 0.66332496 0.13762047 0.67419986 0.80909091 0.60302269] mean value: 0.5666990369115905 key: train_mcc value: [0.88757469 0.63581076 0.96830553 0.76193012 0.76291765 0.67012598 0.88607273 0.71085804 0.94755736 0.91860433] mean value: 0.8149757191296667 key: test_accuracy value: [0.71428571 0.66666667 0.85714286 0.9047619 0.66666667 0.80952381 0.57142857 0.80952381 0.9047619 0.76190476] mean value: 0.7666666666666666 key: train_accuracy value: [0.94179894 0.78835979 0.98412698 0.87830688 0.86772487 0.80952381 0.94179894 0.83597884 0.97354497 0.95767196] mean value: 0.8978835978835978 key: test_fscore value: [0.66666667 0.74074074 0.84210526 0.9 0.53333333 0.84615385 0.60869565 0.77777778 0.90909091 0.70588235] mean value: 0.7530446542036258 key: train_fscore value: [0.94472362 0.82608696 0.98429319 0.87150838 0.84848485 0.83928571 0.94358974 0.80254777 0.97297297 0.95555556] mean value: 0.898904875380721 key: test_precision value: [0.75 0.58823529 0.88888889 0.9 0.8 0.73333333 0.58333333 1. 0.90909091 1. ] mean value: 0.8152881758764112 key: train_precision value: [0.90384615 0.7037037 0.97916667 0.92857143 1. 0.72307692 0.91089109 1. 0.98901099 1. 
] mean value: 0.9138266953984776 key: test_recall value: [0.6 1. 0.8 0.9 0.4 1. 0.63636364 0.63636364 0.90909091 0.54545455] mean value: 0.7427272727272727 key: train_recall value: [0.98947368 1. 0.98947368 0.82105263 0.73684211 1. 0.9787234 0.67021277 0.95744681 0.91489362] mean value: 0.9058118701007839 key: test_roc_auc value: [0.70909091 0.68181818 0.85454545 0.90454545 0.65454545 0.8 0.56818182 0.81818182 0.90454545 0.77272727] mean value: 0.7668181818181818 key: train_roc_auc value: [0.94154535 0.78723404 0.98409854 0.87861142 0.86842105 0.81052632 0.94199328 0.83510638 0.97346025 0.95744681] mean value: 0.8978443449048152 key: test_jcc value: [0.5 0.58823529 0.72727273 0.81818182 0.36363636 0.73333333 0.4375 0.63636364 0.83333333 0.54545455] mean value: 0.6183311051693404 key: train_jcc value: [0.8952381 0.7037037 0.96907216 0.77227723 0.73684211 0.72307692 0.89320388 0.67021277 0.94736842 0.91489362] mean value: 0.8225888907479606 MCC on Blind test: 0.61 Accuracy on Blind test: 0.8 Model_name: Stochastic GDescent Model func: SGDClassifier(n_jobs=10, random_state=42) List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior. 
_warn_prf(average, modifier, msg_start, len(result)) [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', 
RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', SGDClassifier(n_jobs=10, random_state=42))]) key: fit_time value: [0.01661968 0.01596975 0.01542592 0.01607823 0.01489186 0.01536465 0.01502705 0.01687884 0.03487134 0.02883863] mean value: 0.018996596336364746 key: score_time value: [0.01196527 0.01202655 0.01178312 0.01207161 0.01201534 0.01186323 0.01198721 0.01221204 0.02217579 0.02880287] mean value: 0.014690303802490234 key: test_mcc value: [0.53935989 0. 0.74161985 0.90909091 0.53300179 0.50874702 0.03739788 0.90909091 0.50874702 0.26593594] mean value: 0.49529912072758353 key: train_mcc value: [0.94713854 0.47421554 0.89436546 0.87787601 0.51260702 0.64546146 0.84944554 0.83355494 0.38837405 0.37937244] mean value: 0.6802411002611571 key: test_accuracy value: [0.76190476 0.52380952 0.85714286 0.95238095 0.71428571 0.71428571 0.52380952 0.95238095 0.71428571 0.61904762] mean value: 0.7333333333333333 key: train_accuracy value: [0.97354497 0.68253968 0.94708995 0.93650794 0.70899471 0.79365079 0.92063492 0.91005291 0.62962963 0.62433862] mean value: 0.8126984126984127 key: test_fscore value: [0.70588235 0. 
0.82352941 0.95238095 0.76923077 0.78571429 0.58333333 0.95238095 0.78571429 0.71428571] mean value: 0.7072452057746176 key: train_fscore value: [0.97382199 0.53846154 0.94791667 0.94 0.7755102 0.82819383 0.92537313 0.9005848 0.72868217 0.72586873] mean value: 0.8284413057399109 key: test_precision value: [0.85714286 0. 1. 0.90909091 0.625 0.64705882 0.53846154 1. 0.64705882 0.58823529] mean value: 0.6812048245871776 key: train_precision value: [0.96875 1. 0.93814433 0.8952381 0.63333333 0.70676692 0.86915888 1. 0.57317073 0.56969697] mean value: 0.8154259255670528 key: test_recall value: [0.6 0. 0.7 1. 1. 1. 0.63636364 0.90909091 1. 0.90909091] mean value: 0.7754545454545454 key: train_recall value: [0.97894737 0.36842105 0.95789474 0.98947368 1. 1. 0.9893617 0.81914894 1. 1. ] mean value: 0.9103247480403136 key: test_roc_auc value: [0.75454545 0.5 0.85 0.95454545 0.72727273 0.7 0.51818182 0.95454545 0.7 0.60454545] mean value: 0.7263636363636363 key: train_roc_auc value: [0.97351624 0.68421053 0.94703247 0.9362262 0.70744681 0.79473684 0.92099664 0.90957447 0.63157895 0.62631579] mean value: 0.8131634938409854 key: test_jcc value: [0.54545455 0. 
0.7 0.90909091 0.625 0.64705882 0.41176471 0.90909091 0.64705882 0.55555556] mean value: 0.5950074272133096 key: train_jcc value: [0.94897959 0.36842105 0.9009901 0.88679245 0.63333333 0.70676692 0.86111111 0.81914894 0.57317073 0.56969697] mean value: 0.726841119562058 MCC on Blind test: 0.67 Accuracy on Blind test: 0.8 Model_name: AdaBoost Classifier Model func: AdaBoostClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', 
AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', AdaBoostClassifier(random_state=42))]) key: fit_time value: [0.15451217 0.15445352 0.2040379 0.15504122 0.15569568 0.15498495 0.15341592 0.16030359 0.15658045 0.15535378] mean value: 0.1604379177093506 key: score_time value: [0.021106 0.02071619 0.02216125 0.0211637 0.02109528 0.021137 0.02115512 0.02139163 0.02109694 0.02112269] mean value: 0.02121458053588867 key: test_mcc value: [0.90829511 0.90829511 0.90829511 0.82275335 0.90829511 0.71562645 0.52295779 1. 0.90829511 0.90909091] mean value: 0.8511904027211744 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.95238095 0.95238095 0.95238095 0.9047619 0.95238095 0.85714286 0.76190476 1. 0.95238095 0.95238095] mean value: 0.9238095238095237 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.94736842 0.94736842 0.94736842 0.88888889 0.94736842 0.86956522 0.7826087 1. 
0.95652174 0.95238095] mean value: 0.9239439177654281 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 1. 1. 0.83333333 0.75 1. 0.91666667 1. ] mean value: 0.95 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.9 0.9 0.9 0.8 0.9 0.90909091 0.81818182 1. 1. 0.90909091] mean value: 0.9036363636363637 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.95 0.95 0.95 0.9 0.95 0.85454545 0.75909091 1. 0.95 0.95454545] mean value: 0.9218181818181819 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.9 0.9 0.9 0.8 0.9 0.76923077 0.64285714 1. 0.91666667 0.90909091] mean value: 0.8637845487845488 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.87 Accuracy on Blind test: 0.93 Model_name: Bagging Classifier Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, 
interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))]) key: fit_time value: [0.0783937 0.06188798 0.0702672 0.065274 0.08460093 0.07100463 0.07129097 0.0830369 0.0656898 0.09316111] mean value: 0.07446072101593018 key: score_time value: [0.02479196 0.02323222 0.02332997 0.02291703 0.02248645 0.02907228 0.0261395 0.02889991 0.02425504 0.03242946] mean value: 0.02575538158416748 key: test_mcc value: [0.82275335 0.82275335 1. 0.90829511 0.90829511 0.80909091 0.62641448 1. 1. 0.80909091] mean value: 0.8706693216360863 key: train_mcc value: [0.98947368 1. 0.98947368 0.98947368 0.97905701 1. 1. 0.98947251 0.97905701 0.96830907] mean value: 0.9884316657513018 key: test_accuracy value: [0.9047619 0.9047619 1. 0.95238095 0.95238095 0.9047619 0.80952381 1. 1. 0.9047619 ] mean value: 0.9333333333333333 key: train_accuracy value: [0.99470899 1. 0.99470899 0.99470899 0.98941799 1. 1. 0.99470899 0.98941799 0.98412698] mean value: 0.9941798941798942 key: test_fscore value: [0.88888889 0.88888889 1. 0.94736842 0.94736842 0.90909091 0.83333333 1. 1. 0.90909091] mean value: 0.9324029771398192 key: train_fscore value: [0.99470899 1. 0.99470899 0.99470899 0.9893617 1. 1. 0.99465241 0.98947368 0.98412698] mean value: 0.9941741761009266 key: test_precision value: [1. 1. 1. 1. 1. 0.90909091 0.76923077 1. 1. 0.90909091] mean value: 0.9587412587412587 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 0.97916667 0.97894737] mean value: 0.995811403508772 key: test_recall value: [0.8 0.8 1. 0.9 0.9 0.90909091 0.90909091 1. 1. 0.90909091] mean value: 0.9127272727272727 key: train_recall value: [0.98947368 1. 
0.98947368 0.98947368 0.97894737 1. 1. 0.9893617 1. 0.9893617 ] mean value: 0.9926091825307951 key: test_roc_auc value: [0.9 0.9 1. 0.95 0.95 0.90454545 0.80454545 1. 1. 0.90454545] mean value: 0.9313636363636364 key: train_roc_auc value: [0.99473684 1. 0.99473684 0.99473684 0.98947368 1. 1. 0.99468085 0.98947368 0.98415454] mean value: 0.9941993281075028 key: test_jcc value: [0.8 0.8 1. 0.9 0.9 0.83333333 0.71428571 1. 1. 0.83333333] mean value: 0.8780952380952382 key: train_jcc value: [0.98947368 1. 0.98947368 0.98947368 0.97894737 1. 1. 0.9893617 0.97916667 0.96875 ] mean value: 0.9884646789846958 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Gaussian Process Model func: GaussianProcessClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, 
tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GaussianProcessClassifier(random_state=42))]) key: fit_time value: [0.09878612 0.09960461 0.09258199 0.09417605 0.09641194 0.08949471 0.09223723 0.11594605 0.1179924 0.12251711] mean value: 0.10197482109069825 key: score_time value: [0.03881836 0.03061247 0.03737235 0.03541827 0.03377557 0.02945137 0.03179479 0.0367341 0.0444777 0.0131073 ] mean value: 0.03315622806549072 key: test_mcc value: [ 0.23373675 0.62641448 0.74161985 0.33636364 0.63305416 0.42727273 -0.03739788 0.82572282 0.4719399 0.67419986] mean value: 0.49329263185333116 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 key: test_accuracy value: [0.61904762 0.80952381 0.85714286 0.66666667 0.80952381 0.71428571 0.47619048 0.9047619 0.71428571 0.80952381] mean value: 0.7380952380952381 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.55555556 0.77777778 0.82352941 0.66666667 0.81818182 0.72727273 0.42105263 0.9 0.66666667 0.77777778] mean value: 0.7134481033242643 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.625 0.875 1. 0.63636364 0.75 0.72727273 0.5 1. 0.85714286 1. ] mean value: 0.797077922077922 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.5 0.7 0.7 0.7 0.9 0.72727273 0.36363636 0.81818182 0.54545455 0.63636364] mean value: 0.6590909090909091 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.61363636 0.80454545 0.85 0.66818182 0.81363636 0.71363636 0.48181818 0.90909091 0.72272727 0.81818182] mean value: 0.7395454545454545 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.38461538 0.63636364 0.7 0.5 0.69230769 0.57142857 0.26666667 0.81818182 0.5 0.63636364] mean value: 0.5705927405927406 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.17 Accuracy on Blind test: 0.6 Model_name: Gradient Boosting Model func: GradientBoostingClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), 
('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', GradientBoostingClassifier(random_state=42))]) key: fit_time value: [0.4725976 0.47249556 0.44347739 0.43946433 0.55229735 0.44706702 0.44039941 0.434376 0.51444864 0.48032069] mean value: 0.4696943998336792 key: score_time value: [0.01501417 0.01269388 0.0131402 0.0126729 0.01365566 0.01293302 0.01276636 0.01274252 0.01311779 0.0128448 ] mean value: 0.013158130645751952 key: test_mcc value: [0.82275335 0.90829511 1. 0.90829511 1. 1. 0.62641448 1. 1. 1. ] mean value: 0.9265758046971604 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.9047619 0.95238095 1. 0.95238095 1. 1. 0.80952381 1. 1. 1. ] mean value: 0.9619047619047619 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_fscore value: [0.88888889 0.94736842 1. 0.94736842 1. 1. 0.83333333 1. 1. 1. ] mean value: 0.9616959064327486 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [1. 1. 1. 1. 1. 1. 0.76923077 1. 1. 1. ] mean value: 0.9769230769230769 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [0.8 0.9 1. 0.9 1. 1. 0.90909091 1. 1. 1. 
] mean value: 0.9509090909090909 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.9 0.95 1. 0.95 1. 1. 0.80454545 1. 1. 1. ] mean value: 0.9604545454545454 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.8 0.9 1. 0.9 1. 1. 0.71428571 1. 1. 1. ] mean value: 0.9314285714285715 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: QDA Model func: QuadraticDiscriminantAnalysis() List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', 
PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear warnings.warn("Variables are collinear")
Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', QuadraticDiscriminantAnalysis())]) key: fit_time value: [0.03971577 0.05372071 0.04902482 0.0534029 0.05441594 0.0602479 0.05954766 0.05021262 0.05034852 0.06813431] mean value: 0.05387711524963379 key: score_time value: [0.02256966 0.02043724 0.01937819 0.01983213 0.01934934 0.01934004 0.01987314 0.02205038 0.07356358 0.0410378 ] mean value: 0.027743148803710937 key: test_mcc value: [0.60302269 0.82572282 0.67419986 0.60302269 0.60302269 0.66332496 0.50874702 0.66332496 0.74161985 0.82275335] mean value: 0.6708760888902932 key: train_mcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_accuracy value: [0.76190476 0.9047619 0.80952381 0.76190476 0.76190476 0.80952381 0.71428571 0.80952381 0.85714286 0.9047619 ] mean value: 0.8095238095238095 key: train_accuracy value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0 key: test_fscore value: [0.8 0.90909091 0.83333333 0.8 0.8 0.84615385 0.78571429 0.84615385 0.88 0.91666667] mean value: 0.8417112887112888 key: train_fscore value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_precision value: [0.66666667 0.83333333 0.71428571 0.66666667 0.66666667 0.73333333 0.64705882 0.73333333 0.78571429 0.84615385] mean value: 0.7293212669683258 key: train_precision value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: train_recall value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_roc_auc value: [0.77272727 0.90909091 0.81818182 0.77272727 0.77272727 0.8 0.7 0.8 0.85 0.9 ] mean value: 0.8095454545454546 key: train_roc_auc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] mean value: 1.0 key: test_jcc value: [0.66666667 0.83333333 0.71428571 0.66666667 0.66666667 0.73333333 0.64705882 0.73333333 0.78571429 0.84615385] mean value: 0.7293212669683258 key: train_jcc value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 
mean value: 1.0 MCC on Blind test: 0.0 Accuracy on Blind test: 0.6 Model_name: Ridge Classifier Model func: RidgeClassifier(random_state=42) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', 
QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifier(random_state=42))]) key: fit_time value: [0.03402305 0.05161929 0.07370734 0.047405 0.04301023 0.04708791 0.04369712 0.04251552 0.04322314 0.04266644] mean value: 0.046895503997802734 key: score_time value: [0.03400087 0.05133557 0.03652644 0.0336163 0.03723359 0.03171778 0.0283761 0.03356314 0.02800155 0.03366947] mean value: 0.03480408191680908 key: test_mcc value: [0.74161985 0.62641448 0.74161985 0.90909091 0.71818182 0.71818182 0.23636364 1. 0.80909091 0.82572282] mean value: 0.732628609547866 key: train_mcc value: [0.95767077 0.92597156 0.96830907 0.94714446 0.96830553 0.93672304 0.95767077 0.95767077 0.95788064 0.96830907] mean value: 0.9545655686835185 key: test_accuracy value: [0.85714286 0.80952381 0.85714286 0.95238095 0.85714286 0.85714286 0.61904762 1. 0.9047619 0.9047619 ] mean value: 0.8619047619047618 key: train_accuracy value: [0.97883598 0.96296296 0.98412698 0.97354497 0.98412698 0.96825397 0.97883598 0.97883598 0.97883598 0.98412698] mean value: 0.9772486772486773 key: test_fscore value: [0.82352941 0.77777778 0.82352941 0.95238095 0.85714286 0.85714286 0.63636364 1. 
0.90909091 0.9 ] mean value: 0.8536957813428402 key: train_fscore value: [0.97894737 0.96335079 0.98412698 0.97354497 0.98429319 0.96842105 0.9787234 0.9787234 0.97849462 0.98412698] mean value: 0.9772752774075717 key: test_precision value: [1. 0.875 1. 0.90909091 0.81818182 0.9 0.63636364 1. 0.90909091 1. ] mean value: 0.9047727272727273 key: train_precision value: [0.97894737 0.95833333 0.9893617 0.9787234 0.97916667 0.95833333 0.9787234 0.9787234 0.98913043 0.97894737] mean value: 0.9768390419851665 key: test_recall value: [0.7 0.7 0.7 1. 0.9 0.81818182 0.63636364 1. 0.90909091 0.81818182] mean value: 0.8181818181818181 key: train_recall value: [0.97894737 0.96842105 0.97894737 0.96842105 0.98947368 0.9787234 0.9787234 0.9787234 0.96808511 0.9893617 ] mean value: 0.9777827547592385
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:188: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./pnca_sl.py:191: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
key: test_roc_auc value: [0.85 0.80454545 0.85 0.95454545 0.85909091 0.85909091 0.61818182 1. 0.90454545 0.90909091] mean value: 0.8609090909090908 key: train_roc_auc value: [0.97883539 0.96293393 0.98415454 0.97357223 0.98409854 0.96830907 0.97883539 0.97883539 0.9787794 0.98415454] mean value: 0.9772508398656214 key: test_jcc value: [0.7 0.63636364 0.7 0.90909091 0.75 0.75 0.46666667 1. 
0.83333333 0.81818182] mean value: 0.7563636363636363 key: train_jcc value: [0.95876289 0.92929293 0.96875 0.94845361 0.96907216 0.93877551 0.95833333 0.95833333 0.95789474 0.96875 ] mean value: 0.9556418502799597 MCC on Blind test: 0.72 Accuracy on Blind test: 0.87 Model_name: Ridge ClassifierCV Model func: RidgeClassifierCV(cv=10) List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5, n_estimators=1000, n_jobs=10, oob_score=True, random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, gamma=0, gpu_id=-1, importance_type=None, interaction_constraints='', learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=12, num_parallel_tree=1, predictor='auto', random_state=42, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact', use_label_encoder=False, validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', 
BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))] Running model pipeline: Pipeline(steps=[('prep', ColumnTransformer(remainder='passthrough', transformers=[('num', MinMaxScaler(), Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'rsa', 'kd_values', ... 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'], dtype='object', length=166)), ('cat', OneHotEncoder(), Index(['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'], dtype='object'))])), ('model', RidgeClassifierCV(cv=10))]) key: fit_time value: [0.38085413 0.38319826 0.41932321 0.38498354 0.38405871 0.38967252 0.35881495 0.39832497 0.41211724 0.37258697] mean value: 0.3883934497833252 key: score_time value: [0.03946233 0.02718139 0.03768969 0.03861022 0.03152013 0.03818846 0.03179121 0.0374887 0.02729797 0.03727818] mean value: 0.0346508264541626 key: test_mcc value: [0.74161985 0.74161985 0.66332496 0.90909091 0.71818182 0.80909091 0.23636364 1. 0.90829511 0.74795759] mean value: 0.7475544626453499 key: train_mcc value: [0.95767077 0.95767077 0.96830907 0.94714446 0.94714446 0.95767077 0.95767077 0.95767077 0.96830553 0.95767077] mean value: 0.9576928147788344 key: test_accuracy value: [0.85714286 0.85714286 0.80952381 0.95238095 0.85714286 0.9047619 0.61904762 1. 
0.95238095 0.85714286] mean value: 0.8666666666666667 key: train_accuracy value: [0.97883598 0.97883598 0.98412698 0.97354497 0.97354497 0.97883598 0.97883598 0.97883598 0.98412698 0.97883598] mean value: 0.9788359788359788 key: test_fscore value: [0.82352941 0.82352941 0.75 0.95238095 0.85714286 0.90909091 0.63636364 1. 0.95652174 0.84210526] mean value: 0.8550664180796096 key: train_fscore value: [0.97894737 0.97894737 0.98412698 0.97354497 0.97354497 0.9787234 0.9787234 0.9787234 0.98395722 0.9787234 ] mean value: 0.978796250433165 key: test_precision value: [1. 1. 1. 0.90909091 0.81818182 0.90909091 0.63636364 1. 0.91666667 1. ] mean value: 0.918939393939394 key: train_precision value: [0.97894737 0.97894737 0.9893617 0.9787234 0.9787234 0.9787234 0.9787234 0.9787234 0.98924731 0.9787234 ] mean value: 0.9808844176329636 key: test_recall value: [0.7 0.7 0.6 1. 0.9 0.90909091 0.63636364 1. 1. 0.72727273] mean value: 0.8172727272727273 key: train_recall value: [0.97894737 0.97894737 0.97894737 0.96842105 0.96842105 0.9787234 0.9787234 0.9787234 0.9787234 0.9787234 ] mean value: 0.9767301231802912 key: test_roc_auc value: [0.85 0.85 0.8 0.95454545 0.85909091 0.90454545 0.61818182 1. 0.95 0.86363636] mean value: 0.865 key: train_roc_auc value: [0.97883539 0.97883539 0.98415454 0.97357223 0.97357223 0.97883539 0.97883539 0.97883539 0.98409854 0.97883539] mean value: 0.9788409854423291 key: test_jcc value: [0.7 0.7 0.6 0.90909091 0.75 0.83333333 0.46666667 1. 0.91666667 0.72727273] mean value: 0.7603030303030303 key: train_jcc value: [0.95876289 0.95876289 0.96875 0.94845361 0.94845361 0.95833333 0.95833333 0.95833333 0.96842105 0.95833333] mean value: 0.9584937375655634 MCC on Blind test: 0.6 Accuracy on Blind test: 0.8